(54) Image interpolating method and apparatus

(57) An image inputting unit inputs a first image and a second image. A matching processor (14) computes
the pixel matching between those images, so that a corresponding point is obtained on the second image with
respect to lattice points of the mesh taken on the first image. A result thereof is recorded as a corresponding
point file. An intermediate image generator (18) generates an intermediate image of the first image and the
second image, based on the corresponding point file. Color difference data on a pair of corresponding points
of the first image and second image are inputted in the corresponding point file in advance.
Description

[0001] The present invention relates to an image interpolation technique, and it particularly (though not exclusively)
relates to a method and apparatus for interpolating two images based on a matching technique.
[0002] A very large number of users have come to connect to the Internet using portable telephones. Besides
normal telephone calls, the portable telephone is mainly used for Web services and electronic mail services
via the Internet. In particular, browsing information that can be transmitted and received on a text basis, such as
timetables and stock prices, as well as browsing Web pages designed specifically for portable telephones, is becoming
a typical usage of the portable telephones.
[0003] On the other hand, the color LCD has recently been employed in the display unit of the portable telephone, and
distribution of motion pictures in which relatively simple images are dominant has started. Thus, the motion pictures
combined with normal texts can be utilized on the portable telephone, so that the portable telephones, prepared primarily
for telephone call usage at the outset, are now establishing a position as the first developed wearable computers.
[0004] However, the merchandise value of the portable telephones lies primarily in their light weight, long battery
life, economical hardware, smooth operability and so forth. Thus, taking a long time to download heavy image
data is undesirable. Moreover, the CPU power needed to process such heavy image data is disadvantageous in terms of the
power consumed.

[0005] Various respective aspects and features of the invention are defined in the appended claims. Features from
the dependent claims may be combined with features of the independent claims as appropriate and not merely as
explicitly set out in the claims.

[0006] The present invention has been made in view of the foregoing circumstances and embodiments thereof can 
provide an image interpolation technique by which the motion pictures can be generated and displayed based on a 
small amount of image data. 

[0007] The present invention relates to an image interpolation technique. This technique can utilize the image matching
technique (referred to as the "base technology" hereinafter) proposed in Japanese Patent No. 2927350, owned
by the same assignee as the present patent application.

[0008] An embodiment according to the present invention relates to an image interpolation method at an encoding 
end. This method is a coding method to generate data which are used for interpolating two images (that are a first 
image and a second image, hereinafter), and the method includes: 

acquiring a first image and a second image; and computing a matching between the first image and the second image
acquired, and detecting points which correspond between the images, so as to generate a corresponding point file,
wherein, in addition to positional information on the corresponding points, difference data on pixel values of the
corresponding points are stored in the corresponding point file. It is to be noted that "points" and "pixels" will be used
interchangeably hereinafter, so that the two are not distinguished. Moreover, a "point" may have area;
namely, it may be a region. A "pixel value" is also referred to as "color" hereinafter; however, it is not limited to color
and may be any arbitrary attribute.

[0009] For example, suppose that, as a result of the matching, a point $p_1(x_1, y_1)$ of the first image is known to
correspond to a point $p_2(x_2, y_2)$ of the second image. Moreover, suppose that the colors of these points are $v_1$ and $v_2$,
respectively. In this case, an intermediate image of the first image and second image can be generated by interpolating
the coordinates of these points. Namely, in the intermediate image, the position of a point (hereinafter referred to as
an interpolation point) moved from the point $p_1$ is expressed by the following formula.

$$((1-t)x_1 + t x_2,\ (1-t)y_1 + t y_2)$$

[0010] On the other hand, in the base technology the color of the interpolation point is set to

$$(1-t)v_1 + t v_2$$

by interpolating the color $v_1$ of the point $p_1$ of the first image and the color $v_2$ of the point $p_2$ of the second image in a
similar manner. Namely, in the base technology both the first image and the second image are referred to in the course
of interpolating the color.

[0011] In contrast thereto, in the present invention, instead of referring to the second image in order to interpolate the color,
difference data on the colors of points which correspond between the first image and the second image (such points being
also referred to as a pair of corresponding points hereinafter) are incorporated into the corresponding point
file in advance. Thus, there is no need to refer to the second image. Specifically, when the
difference data is defined to be $\Delta v = v_2 - v_1$, the color at the interpolation point on the intermediate image, which
lies between the above-described point $p_1$ of the first image and the point $p_2$ of the second image, can be described as

$$v_1 + t\,\Delta v$$
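
As an illustrative sketch of the difference-based interpolation described above (not the patented implementation itself), the following Python fragment computes the position and color of an interpolation point from the point of the first image, the corresponding position and the stored difference data alone. The function names, the tuple representation of colors and the restriction of t to the range 0 to 1 are assumptions made for illustration.

```python
def encode_difference(v1, v2):
    """Encoder side: difference data delta_v = v2 - v1, per color channel."""
    return tuple(b - a for a, b in zip(v1, v2))


def interpolate_point(p1, p2, v1, delta_v, t):
    """Decoder side: position and color of the interpolation point.

    p1, p2  : (x, y) positions of a pair of corresponding points
    v1      : color of p1 in the first image
    delta_v : stored difference data (v2 - v1); v2 itself is not needed
    t       : interpolation parameter, assumed here to lie in [0, 1]
    """
    x = (1 - t) * p1[0] + t * p2[0]
    y = (1 - t) * p1[1] + t * p2[1]
    color = tuple(c + t * d for c, d in zip(v1, delta_v))
    return (x, y), color


delta = encode_difference((10, 20, 30), (20, 20, 10))
midpoint = interpolate_point((0, 0), (4, 8), (10, 20, 30), delta, 0.5)
```

At t = 0 the sketch returns the point of the first image unchanged, and at t = 1 it reproduces the position and color of the corresponding point of the second image, although the second image itself is never read.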

[0012] The interpolation technique according to the present invention is advantageous in terms of memory usage
efficiency, the transmission bandwidth of the image data and the streaming of processing, in the sense that there is no need to
acquire in advance the second image, which is to be reproduced (in temporal order) after the intermediate image,
and to store it in the memory. Moreover, since a pair of corresponding points are naturally expected to have colors
close to each other, the difference data can generally be expressed by a relatively small number of bits. Thus, the data
amount of the difference data is generally smaller than that of the second image. Moreover, the difference data usually
shows a statistical bias centered at zero, so it is preferable that the difference data be entropy-coded and thereafter
stored in the corresponding point file. Thereby, a further data compression effect is obtained.
[0013] Another embodiment of the present invention relates to an image interpolation method at a decoding end.
This method includes: acquiring a corresponding point file which describes a matching result of a first image and a
second image; and generating an intermediate image of the first image and the second image by performing interpolation
thereon based on the corresponding point file, wherein the corresponding point file includes: positional information
on points which correspond between the first image and the second image; and difference data of pixel values thereof;
and wherein, in said generating, the intermediate image is generated based on the first image, the positional information
and the difference data. In other words, the second image need not be referred to at the stage of generating the intermediate
image. Here, what is meant by the "first image" being utilized is that a representative one of the first image and
the second image is utilized. Thus, only one of the images suffices. The interpolation may be of a linear or nonlinear type.
[0014] Still another embodiment relates to an image interpolation apparatus at an encoding end. This apparatus is 
a coding apparatus which generates data for interpolating the images, and includes: an image input unit which acquires 
a first image and a second image; and a matching processor which computes a matching between the first image and 
the second image thus acquired and which generates a corresponding point file by detecting points that correspond 
between the images, wherein, in addition to positional information on the points that correspond between the images, 
difference data of pixel values thereof are incorporated in the corresponding point file by the matching processor. 
[0015] The matching processor may detect points on the second image that correspond to lattice points of a mesh
provided on the first image, and based on the thus detected result a destination polygon corresponding to the second
image may be defined for a source polygon that constitutes the mesh on the first image.

[0016] The matching processor may perform a pixel-by-pixel matching computation based on correspondence between
a critical point detected through a two-dimensional search on the first image and a critical point detected through
a two-dimensional search on the second image. The matching processor may multiresolutionalize the first image and
the second image by respectively extracting the critical points, then may perform a pixel-by-pixel matching computation
between the same multiresolution levels, and may acquire a pixel-by-pixel correspondence relation at the finest level of
resolution at a final stage while inheriting a result of the pixel-by-pixel matching computation in a matching computation
at a different multiresolution level.

[0017] Here, the matching method utilizing the critical points is an application of the base technology. However, the
base technology does not touch on the features of the present invention relating to the lattice points or the polygons
determined thereby. Introducing a simplified technique such as the polygons in the present invention
makes possible a significant reduction in the size of the corresponding point file.

[0018] Namely, in a case where the first and second images each have $n \times m$ pixels, there are $(n \times m)^2$
combinations if their pixel-by-pixel correspondence is described as it is, so that the size of the corresponding point
file will become extremely large. Instead, this correspondence is modified by describing the correspondence
relation between the lattice points or, substantially equivalently, the correspondence relation between polygons determined
by the lattice points, so that the data amount is reduced significantly. Storage of the first or second image and
the corresponding point file suffices to reproduce the motion pictures, so that a significantly improved effect is
achieved in the transmission, storage and so forth of the motion pictures.
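
To give a rough feel for the reduction described above, the following Python sketch compares the number of pixel-pair combinations with the number of lattice points of a mesh. The mesh spacing parameter `spacing` and both function names are hypothetical; the text does not specify how fine the mesh is, so the figures are only indicative.

```python
def dense_pairs(n, m):
    """Number of pixel-pair combinations between two n x m images, (n*m)^2."""
    return (n * m) ** 2


def lattice_points(n, m, spacing):
    """Number of lattice points of a regular mesh with the given spacing.

    `spacing` (in pixels) is an assumed parameter for illustration only.
    """
    return (n // spacing + 1) * (m // spacing + 1)


pairs = dense_pairs(256, 256)           # combinations for a 256 x 256 image pair
points = lattice_points(256, 256, 16)   # lattice points for a 16-pixel mesh
```

Under these assumed values, only one destination position per lattice point needs to be recorded instead of a description over all pixel pairs.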

[0019] Still another embodiment of the present invention relates also to an image interpolation apparatus at a decoding
end. This apparatus includes: a communication unit which acquires a corresponding point file which describes
a matching result of a first image and a second image; and an intermediate image generator which generates an
intermediate image of the first image and the second image by performing interpolation thereon based on the corresponding
point file, wherein the corresponding point file includes: positional information on points which correspond
between the first image and the second image; and difference data of pixel values thereof, and wherein the intermediate
image generator generates the intermediate image based on either one of the first image or the second image, and
the positional information and the difference data.

[0020] Moreover, the image interpolation apparatus may further include a display unit which displays at least the
intermediate image. The apparatus may further include a corresponding point file storage which records the
corresponding point file in association with the first image. The intermediate image generator may be such
that a point of interest within the first image is moved according to the positional information, and the position and pixel value
of the point which corresponds to that point of interest are determined in the intermediate image by also varying the pixel
value of the point of interest based on the difference data.

[0021] Still another embodiment relates also to an image interpolation method. This method includes: acquiring a 
matching result computed between a first image and a second image; and generating an intermediate image of the 
first image and second image without referring to the second image at this stage, by acting the matching result upon 
the first image and thus by varying position and value of pixels included in the first image. 

[0022] It is to be noted that the base technology is not prerequisite for the present invention. Moreover, any arbitrary 
replacement or substitution of the above-described structural components and the steps, expressions replaced or 
substituted in part or whole between a method and an apparatus as well as addition thereof, and expressions changed 
to a computer program, recording medium or the like are all effective as and encompassed by the present invention. 
[0023] Moreover, this summary of the invention does not necessarily describe all necessary features, so that the
invention may also be a sub-combination of these described features.

[0024] The invention will now be described by way of example with reference to the accompanying drawings, throughout
which like parts are referred to by like references, and in which:



Fig. 1(a) is an image obtained as a result of the application of an averaging filter to a human facial image.
Fig. 1(b) is an image obtained as a result of the application of an averaging filter to another human facial image.
Fig. 1(c) is an image of a human face at $p^{(5,0)}$ obtained in a preferred embodiment in the base technology.
Fig. 1(d) is another image of a human face at $p^{(5,0)}$ obtained in a preferred embodiment in the base technology.
Fig. 1(e) is an image of a human face at $p^{(5,1)}$ obtained in a preferred embodiment in the base technology.
Fig. 1(f) is another image of a human face at $p^{(5,1)}$ obtained in a preferred embodiment in the base technology.
Fig. 1(g) is an image of a human face at $p^{(5,2)}$ obtained in a preferred embodiment in the base technology.
Fig. 1(h) is another image of a human face at $p^{(5,2)}$ obtained in a preferred embodiment in the base technology.
Fig. 1(i) is an image of a human face at $p^{(5,3)}$ obtained in a preferred embodiment in the base technology.
Fig. 1(j) is another image of a human face at $p^{(5,3)}$ obtained in a preferred embodiment in the base technology.
Fig. 2(R) shows an original quadrilateral. 
Fig. 2(A) shows an inherited quadrilateral. 
Fig. 2(B) shows an inherited quadrilateral. 
Fig. 2(C) shows an inherited quadrilateral. 
Fig. 2(D) shows an inherited quadrilateral. 
Fig. 2(E) shows an inherited quadrilateral. 

Fig. 3 is a diagram showing the relationship between a source image and a destination image and that between 
the m-th level and the (m-1)th level, using a quadrilateral. 

Fig. 4 shows the relationship between a parameter $\eta$ (represented by x-axis) and energy $C_f$ (represented by y-axis).
Fig. 5(a) is a diagram illustrating determination of whether or not the mapping for a certain point satisfies the 
bijectivity condition through the outer product computation. 

Fig. 5(b) is a diagram illustrating determination of whether or not the mapping for a certain point satisfies the 
bijectivity condition through the outer product computation. 

Fig. 6 is a flowchart of the entire procedure of a preferred embodiment in the base technology. 
Fig. 7 is a flowchart showing the details of the process at S1 in Fig. 6. 
Fig. 8 is a flowchart showing the details of the process at S10 in Fig. 7. 

Fig. 9 is a diagram showing correspondence between partial images of the m-th and (m-1)th levels of resolution. 

Fig. 10 is a diagram showing source images generated in the embodiment in the base technology.

Fig. 11 is a flowchart of a preparation procedure for S2 in Fig. 6. 

Fig. 12 is a flowchart showing the details of the process at S2 in Fig. 6. 

Fig. 13 is a diagram showing the way a submapping is determined at the 0-th level. 

Fig. 14 is a diagram showing the way a submapping is determined at the first level.

Fig. 15 is a flowchart showing the details of the process at S21 in Fig. 6. 

Fig. 16 is a graph showing the behavior of energy $C_f^{(m,s)}$ corresponding to $f^{(m,s)}$ ($\lambda = i\Delta\lambda$) which has been obtained
for a certain $f^{(m,s)}$ while changing $\lambda$.

Fig. 17 is a diagram showing the behavior of energy $C_f^{(n)}$ corresponding to $f^{(n)}$ ($\eta = i\Delta\eta$) ($i = 0, 1, \ldots$) which has been
obtained while changing $\eta$.

Fig. 18 shows how certain pixels correspond between the first image and the second image. 

Fig. 19 shows a correspondence relation between a source polygon taken on the first image and a destination
polygon taken on the second image.

Fig. 20 shows a procedure by which to obtain points in the destination polygon corresponding to points in the
source polygon.

Fig. 21 is a flowchart showing a procedure for generating the corresponding point file according to a present 
embodiment. 

Fig. 22 is a flowchart showing a procedure for generating an intermediate image based on the corresponding point 
file. 

Fig. 23 shows a structure of an image interpolation apparatus according to a present embodiment. 

[0025] The invention will now be described based on the preferred embodiments, which do not intend to limit the 
scope of the present invention, but exemplify the invention. All of the features and the combinations thereof described 
in the embodiment are not necessarily essential to the invention. 

[0026] At first, the multiresolutional critical point filter technology and the image matching processing using the technology,
both of which will be utilized in the preferred embodiments, will be described in detail as "Base Technology".
Namely, the following sections [1] and [2] belong to the base technology, where [1] describes elemental techniques
and [2] describes a processing procedure. These techniques are patented under Japanese Patent No. 2927350 and
owned by the same assignee as the present invention, and they achieve optimal results when combined with
the present invention. According to the present embodiments, a mesh is provided on the image, so that lattice
points thereof represent a plurality of pixels. Thus, the application efficiency of a pixel-by-pixel matching technique such
as that in the base technology is naturally high. However, it is to be noted that the image matching techniques which can
be adopted in the present embodiments are not limited to this. With reference to Figs. 18 to 23, image interpolation techniques utilizing
the base technology will be described in a specific manner.

Base Technology 

[1] Detailed description of elemental techniques 
[1.1] Introduction

[0027] Using a set of new multiresolutional filters called critical point filters, image matching is accurately computed.
There is no need for any prior knowledge concerning objects in question. The matching of the images is computed at
each resolution while proceeding through the resolution hierarchy. The resolution hierarchy proceeds from a coarse
level to a fine level. Parameters necessary for the computation are set completely automatically by dynamical computation
analogous to human visual systems. Thus, there is no need to manually specify the correspondence of points
between the images.

[0028] The base technology can be applied to, for instance, completely automated morphing, object recognition,
stereo photogrammetry, volume rendering, and smooth generation of motion images from a small number of frames. When
applied to the morphing, given images can be automatically transformed. When applied to the volume rendering, intermediate
images between cross sections can be accurately reconstructed, even when the distance between them
is rather long and the cross sections vary widely in shape.

[1.2] The hierarchy of the critical point filters 

[0029] The multiresolutional filters according to the base technology can preserve the intensity and locations of each
critical point included in the images while reducing the resolution. Now, let the width of the image be N and the height
of the image be M. For simplicity, assume that $N = M = 2^n$ where n is a positive integer. An interval $[0, N] \subset \mathbb{R}$ is denoted
by I. A pixel of the image at position (i, j) is denoted by $p_{(i,j)}$ where $i, j \in I$.

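[0030] Here, a multiresolutional hierarchy is introduced. Hierarchized image groups are produced by a multiresolutional
filter. The multiresolutional filter carries out a two-dimensional search on an original image and detects critical
points therefrom. The multiresolutional filter then extracts the critical points from the original image to construct another
image having a lower resolution. Here, the size of each of the respective images of the m-th level is denoted as $2^m \times 2^m$
($0 \le m \le n$). A critical point filter constructs the following four new hierarchical images recursively, in the direction
descending from n.

$$p^{(m,0)}_{(i,j)} = \min(\min(p^{(m+1,0)}_{(2i,2j)}, p^{(m+1,0)}_{(2i,2j+1)}), \min(p^{(m+1,0)}_{(2i+1,2j)}, p^{(m+1,0)}_{(2i+1,2j+1)}))$$

$$p^{(m,1)}_{(i,j)} = \max(\min(p^{(m+1,1)}_{(2i,2j)}, p^{(m+1,1)}_{(2i,2j+1)}), \min(p^{(m+1,1)}_{(2i+1,2j)}, p^{(m+1,1)}_{(2i+1,2j+1)}))$$

$$p^{(m,2)}_{(i,j)} = \min(\max(p^{(m+1,2)}_{(2i,2j)}, p^{(m+1,2)}_{(2i,2j+1)}), \max(p^{(m+1,2)}_{(2i+1,2j)}, p^{(m+1,2)}_{(2i+1,2j+1)}))$$

$$p^{(m,3)}_{(i,j)} = \max(\max(p^{(m+1,3)}_{(2i,2j)}, p^{(m+1,3)}_{(2i,2j+1)}), \max(p^{(m+1,3)}_{(2i+1,2j)}, p^{(m+1,3)}_{(2i+1,2j+1)})) \quad (1)$$

where let

$$p^{(n,0)}_{(i,j)} = p^{(n,1)}_{(i,j)} = p^{(n,2)}_{(i,j)} = p^{(n,3)}_{(i,j)} = p_{(i,j)} \quad (2)$$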

[0031] The above four images are referred to as subimages hereinafter. When $\min_{x \le t \le x+1}$ and $\max_{x \le t \le x+1}$ are abbreviated
to $\alpha$ and $\beta$, respectively, the subimages can be expressed as follows.

$$P^{(m,0)} = \alpha(x)\alpha(y)p^{(m+1,0)}$$

$$P^{(m,1)} = \alpha(x)\beta(y)p^{(m+1,1)}$$

$$P^{(m,2)} = \beta(x)\alpha(y)p^{(m+1,2)}$$

$$P^{(m,3)} = \beta(x)\beta(y)p^{(m+1,3)}$$



[0032] Namely, they can be considered analogous to the tensor products of $\alpha$ and $\beta$. The subimages correspond to
the respective critical points. As is apparent from the above equations, the critical point filter detects a critical point of 
the original image for every block consisting of 2 X 2 pixels. In this detection, a point having a maximum pixel value 
and a point having a minimum pixel value are searched with respect to two directions, namely, vertical and horizontal 
directions, in each block. Although pixel intensity is used as a pixel value in this base technology, various other values 
relating to the image may be used. A pixel having the maximum pixel values for the two directions, one having minimum 
pixel values for the two directions, and one having a minimum pixel value for one direction and a maximum pixel value 
for the other direction are detected as a local maximum point, a local minimum point, and a saddle point, respectively. 
[0033] By using the critical point filter, an image (1 pixel here) of a critical point detected inside each of the respective
blocks serves to represent its block image (4 pixels here). Thus, the resolution of the image is reduced. From the viewpoint
of singularity theory, $\alpha(x)\alpha(y)$ preserves the local minimum point (minima point), $\beta(x)\beta(y)$ preserves the local maximum
point (maxima point), and $\alpha(x)\beta(y)$ and $\beta(x)\alpha(y)$ preserve the saddle points.
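
The 2 × 2 block filtering described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the base technology: images are assumed to be square lists of lists whose side is a power of two, indexed as `img[i][j]`, and the names `FILTERS`, `reduce_once` and `build_hierarchy` are hypothetical.

```python
# For each subimage type s, the (outer, inner) operators applied to a 2 x 2 block:
# inner combines the pixels (2i, 2j) and (2i, 2j+1) and the pixels (2i+1, 2j) and
# (2i+1, 2j+1); outer then combines the two results.
FILTERS = {0: (min, min), 1: (max, min), 2: (min, max), 3: (max, max)}


def reduce_once(img, s):
    """Build the level-m subimage of type s from the level-(m+1) subimage."""
    outer, inner = FILTERS[s]
    half = len(img) // 2
    return [
        [
            outer(
                inner(img[2 * i][2 * j], img[2 * i][2 * j + 1]),
                inner(img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]),
            )
            for j in range(half)
        ]
        for i in range(half)
    ]


def build_hierarchy(img):
    """Return {s: [level n, level n-1, ..., level 0]} for the four subimage types."""
    pyramids = {}
    for s in FILTERS:
        levels = [img]  # all four types start from the original image
        while len(levels[-1]) > 1:
            levels.append(reduce_once(levels[-1], s))
        pyramids[s] = levels
    return pyramids


hierarchy = build_hierarchy([[1, 2], [3, 4]])
```

Each pass halves the side length, so an image of side $2^n$ yields n + 1 levels per subimage type, with the last level holding a single pixel.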

[0034] At the beginning, a critical point filtering process is applied separately to a source image and a destination
image which are to be matching-computed. Thus, a series of image groups, namely, source hierarchical images and
destination hierarchical images are generated. Four source hierarchical images and four destination hierarchical images
are generated corresponding to the types of the critical points.

[0035] Thereafter, the source hierarchical images and the destination hierarchical images are matched in a series
of the resolution levels. First, the minima points are matched using $p^{(m,0)}$. Next, the saddle points are matched using
$p^{(m,1)}$ based on the previous matching result for the minima points. Other saddle points are matched using $p^{(m,2)}$.
Finally, the maxima points are matched using $p^{(m,3)}$.

[0036] Figs. 1(c) and 1(d) show the subimages $p^{(5,0)}$ of the images in Figs. 1(a) and 1(b), respectively. Similarly,
Figs. 1(e) and 1(f) show the subimages $p^{(5,1)}$. Figs. 1(g) and 1(h) show the subimages $p^{(5,2)}$. Figs. 1(i) and 1(j) show
the subimages $p^{(5,3)}$. Characteristic parts in the images can be easily matched using subimages. The eyes can be
matched by $p^{(5,0)}$ since the eyes are the minima points of pixel intensity in a face. The mouths can be matched by $p^{(5,1)}$
since the mouths have low intensity in the horizontal direction. Vertical lines on both sides of the necks become
clear by $p^{(5,2)}$. The ears and bright parts of cheeks become clear by $p^{(5,3)}$ since these are the maxima points of pixel
intensity.

[0037] As described above, the characteristics of an image can be extracted by the critical point filter. Thus, by
comparing, for example, the characteristics of an image shot by a camera with the characteristics of several objects
recorded in advance, an object shot by the camera can be identified.
[1.3] Computation of mapping between images

[0038] The pixel of the source image at the location (i, j) is denoted by $p^{(n)}_{(i,j)}$ and that of the destination image at (k, l)
is denoted by $q^{(n)}_{(k,l)}$ where $i, j, k, l \in I$. The energy of the mapping between the images (described later) is then defined.
This energy is determined by the difference in the intensity of the pixel of the source image and its corresponding pixel
of the destination image and the smoothness of the mapping. First, the mapping $f^{(m,0)}: p^{(m,0)} \to q^{(m,0)}$ between $p^{(m,0)}$
and $q^{(m,0)}$ with the minimum energy is computed. Based on $f^{(m,0)}$, the mapping $f^{(m,1)}$ between $p^{(m,1)}$ and $q^{(m,1)}$ with the
minimum energy is computed. This process continues until $f^{(m,3)}$ between $p^{(m,3)}$ and $q^{(m,3)}$ is computed. Each $f^{(m,i)}$
($i = 0, 1, 2, \ldots$) is referred to as a submapping. The order of i will be rearranged as shown in the following (3) in computing
$f^{(m,i)}$ for the reasons to be described later.

$$f^{(m,i)}: p^{(m,\sigma(i))} \to q^{(m,\sigma(i))} \quad (3)$$

where $\sigma(i) \in \{0, 1, 2, 3\}$.
[1. 3. 1] Bijectivity 

[0039] When the matching between a source image and a destination image is expressed by means of a mapping,
that mapping shall satisfy the Bijectivity Conditions (BC) between the two images (note that a one-to-one surjective
mapping is called a bijection). This is because the respective images should be connected satisfying both surjection
and injection, and there is no conceptual supremacy existing between these images. It is to be noted that the
mappings to be constructed here are the digital version of the bijection. In the base technology, a pixel is specified by
a grid point.

[0040] The mapping of the source subimage (a subimage of a source image) to the destination subimage (a subimage
of a destination image) is represented by $f^{(m,s)}: I/2^{n-m} \times I/2^{n-m} \to I/2^{n-m} \times I/2^{n-m}$ ($s = 0, 1, \ldots$), where $f^{(m,s)}(i,j) = (k,l)$ means
that $p^{(m,s)}_{(i,j)}$ of the source image is mapped to $q^{(m,s)}_{(k,l)}$ of the destination image. For simplicity, when $f(i,j) = (k,l)$ holds, a
pixel $q_{(k,l)}$ is denoted by $q_{f(i,j)}$.

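[0041] When the data sets are discrete as image pixels (grid points) treated in the base technology, the definition of
bijectivity is important. Here, the bijection will be defined in the following manner, where i, i', j, j', k and l are all integers.
First, each square region (4)

$$p^{(m,s)}_{(i,j)}\, p^{(m,s)}_{(i+1,j)}\, p^{(m,s)}_{(i+1,j+1)}\, p^{(m,s)}_{(i,j+1)} \quad (4)$$

on the source image plane denoted by R is considered, where $i = 0, \ldots, 2^m - 1$ and $j = 0, \ldots, 2^m - 1$. The edges of R are
directed as follows.

$$\overrightarrow{p^{(m,s)}_{(i,j)} p^{(m,s)}_{(i+1,j)}},\ \overrightarrow{p^{(m,s)}_{(i+1,j)} p^{(m,s)}_{(i+1,j+1)}},\ \overrightarrow{p^{(m,s)}_{(i+1,j+1)} p^{(m,s)}_{(i,j+1)}},\ \overrightarrow{p^{(m,s)}_{(i,j+1)} p^{(m,s)}_{(i,j)}} \quad (5)$$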

[0042] This square will be mapped by f to a quadrilateral on the destination image plane. The quadrilateral (6)

$$q^{(m,s)}_{f(i,j)}\, q^{(m,s)}_{f(i+1,j)}\, q^{(m,s)}_{f(i+1,j+1)}\, q^{(m,s)}_{f(i,j+1)} \quad (6)$$

denoted by $f^{(m,s)}(R)$ should satisfy the following bijectivity conditions (BC)

(so that $f^{(m,s)}(R) = f^{(m,s)}(p^{(m,s)}_{(i,j)} p^{(m,s)}_{(i+1,j)} p^{(m,s)}_{(i+1,j+1)} p^{(m,s)}_{(i,j+1)}) = q^{(m,s)}_{f(i,j)} q^{(m,s)}_{f(i+1,j)} q^{(m,s)}_{f(i+1,j+1)} q^{(m,s)}_{f(i,j+1)}$).

1. The edges of the quadrilateral $f^{(m,s)}(R)$ should not intersect one another.

2. The orientation of the edges of $f^{(m,s)}(R)$ should be the same as that of R (clockwise in the case of Fig. 2).
3. As a relaxed condition, retraction mapping is allowed.

[0043] The bijectivity conditions stated above shall be simply referred to as BC hereinafter. 






EP 1 209 619 A2 

[0044] Without a certain type of relaxed condition, there would be no mappings which completely satisfy the BC
other than a trivial identity mapping. Here, the length of a single edge of $f^{(m,s)}(R)$ may be zero. Namely, $f^{(m,s)}(R)$ may
be a triangle. However, it is not allowed to be a point or a line segment having area zero. Specifically, if Fig.
2(R) is the original quadrilateral, Figs. 2(A) and 2(D) satisfy the BC while Figs. 2(B), 2(C) and 2(E) do not satisfy the BC.

[0045] In an actual implementation, the following condition may be further imposed to easily guarantee that the mapping
is surjective. Namely, each pixel on the boundary of the source image is mapped to the pixel that occupies the same
location in the destination image. In other words, $f(i,j) = (i,j)$ (on the four lines of $i = 0$, $i = 2^m - 1$, $j = 0$, $j = 2^m - 1$). This condition
will be hereinafter referred to as an additional condition.
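
The bijectivity conditions can be illustrated with a small Python check. This is a sketch under assumptions rather than the test used in the base technology: it compares the sign of the signed area of the mapped quadrilateral with that of the source square, rejects zero area, and rejects quadrilaterals whose opposite edges cross, while tolerating one collapsed edge (a triangle). Vertices are assumed to be (x, y) tuples listed in the same cyclic order for both shapes, and all function names are hypothetical.

```python
def signed_area2(quad):
    """Twice the signed area of a polygon (shoelace formula); sign gives orientation."""
    total = 0
    for (x0, y0), (x1, y1) in zip(quad, quad[1:] + quad[:1]):
        total += x0 * y1 - x1 * y0
    return total


def cross(o, a, b):
    """Outer (cross) product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def segments_cross(a, b, c, d):
    """True if segments ab and cd cross at a single interior point."""
    return (cross(a, b, c) * cross(a, b, d) < 0
            and cross(c, d, a) * cross(c, d, b) < 0)


def satisfies_bc(src, dst):
    """Check a mapped quadrilateral dst against the source square src."""
    area_src, area_dst = signed_area2(src), signed_area2(dst)
    if area_dst == 0:                       # collapsed to a point or line segment
        return False
    if (area_src > 0) != (area_dst > 0):    # orientation reversed
        return False
    # Only opposite edges of a quadrilateral can cross each other.
    if segments_cross(dst[0], dst[1], dst[2], dst[3]):
        return False
    if segments_cross(dst[1], dst[2], dst[3], dst[0]):
        return False
    return True


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
triangle_ok = satisfies_bc(square, [(0, 0), (1, 0), (1, 1), (1, 1)])
```

In this sketch a quadrilateral with one zero-length edge is accepted, in line with the relaxed condition, whereas a reversed or self-crossing quadrilateral is rejected.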

[1. 3. 2] Energy of mapping

[1. 3. 2. 1] Cost related to the pixel intensity

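[0046] First, the energy of the mapping f is defined. An objective here is to search for a mapping whose energy becomes
minimum. The energy is determined mainly by the difference in intensity between the pixel of the source image
and its corresponding pixel of the destination image. Namely, the energy $C^{(m,s)}_{(i,j)}$ of the mapping $f^{(m,s)}$ at (i, j) is determined
by the following equation (7).

$$C^{(m,s)}_{(i,j)} = \left| V(p^{(m,s)}_{(i,j)}) - V(q^{(m,s)}_{f(i,j)}) \right|^2 \quad (7)$$

where $V(p^{(m,s)}_{(i,j)})$ and $V(q^{(m,s)}_{f(i,j)})$ are the intensity values of the pixels $p^{(m,s)}_{(i,j)}$ and $q^{(m,s)}_{f(i,j)}$, respectively. The total energy
$C_f^{(m,s)}$ of f is a matching evaluation equation, and can be defined as the sum of $C^{(m,s)}_{(i,j)}$ as shown in the following equation (8).

$$C_f^{(m,s)} = \sum_{i=0}^{2^m-1} \sum_{j=0}^{2^m-1} C^{(m,s)}_{(i,j)} \quad (8)$$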
[1. 3. 2. 2] Cost related to the locations of the pixel for smooth mapping




$$D^{(m,s)}_{(i,j)} = \eta E^{(m,s)}_{0(i,j)} + E^{(m,s)}_{1(i,j)} \quad (9)$$

where the coefficient parameter $\eta$, which is equal to or greater than 0, is a real number. And we have

$$E^{(m,s)}_{0(i,j)} = \left\| (i,j) - f^{(m,s)}(i,j) \right\|^2 \quad (10)$$

$$E^{(m,s)}_{1(i,j)} = \sum_{i'=i-1}^{i} \sum_{j'=j-1}^{j} \left\| (f^{(m,s)}(i,j) - (i,j)) - (f^{(m,s)}(i',j') - (i',j')) \right\|^2 / 4 \quad (11)$$

where $\|(x,y)\| = \sqrt{x^2 + y^2}$ (12) and $f(i',j')$ is defined to be zero for $i' < 0$ and $j' < 0$. $E_0$ is determined by the distance between
(i, j) and f(i, j). $E_0$ prevents a pixel from being mapped to a pixel too far away from it. However, $E_0$ will be replaced later
by another energy function. $E_1$ ensures the smoothness of the mapping. $E_1$ represents a distance between the displacement
of $p_{(i,j)}$ and the displacement of its neighboring points. Based on the above consideration, another evaluation
equation for evaluating the matching, or the energy $D_f$, is determined by the following equation (13).

$$D_f^{(m,s)} = \sum_{i=0}^{2^m-1} \sum_{j=0}^{2^m-1} D^{(m,s)}_{(i,j)} \quad (13)$$
[1. 3. 2. 3] Total energy of the mapping

[0048] The total energy of the mapping, that is, a combined evaluation equation which relates to the combination of
a plurality of evaluations, is defined as $\lambda C_f^{(m,s)} + D_f^{(m,s)}$, where $\lambda \ge 0$ is a real number. The goal is to detect a state in
which the combined evaluation equation has an extreme value, namely, to find a mapping which gives the minimum
energy expressed by the following (14).

$$\min_f \{ \lambda C_f^{(m,s)} + D_f^{(m,s)} \} \quad (14)$$

[0049] Care must be exercised in that the mapping becomes an identity mapping if $\lambda = 0$ and $\eta = 0$ (i.e., $f^{(m,s)}(i,j) = (i,j)$
for all $i = 0, 1, \ldots, 2^m - 1$ and $j = 0, 1, \ldots, 2^m - 1$). As will be described later, the mapping can be gradually modified or transformed
from an identity mapping since the case of $\lambda = 0$ and $\eta = 0$ is evaluated at the outset in the base technology. If the combined
evaluation equation is defined as $C_f^{(m,s)} + \lambda D_f^{(m,s)}$, where the original position of $\lambda$ is changed as such, the equation with
$\lambda = 0$ and $\eta = 0$ will be $C_f^{(m,s)}$ only. As a result thereof, pixels would be randomly corresponded to each other only because
their pixel intensities are close, thus making the mapping totally meaningless. Transforming the mapping based on
such a meaningless mapping makes no sense. Thus, the coefficient parameter is so determined that the identity mapping
is initially selected for the evaluation as the best mapping.

[0050] Similar to this base technology, the difference in the pixel intensity and smoothness is considered in the optical 
flow technique. However, the optical flow technique cannot be used for image transformation since the optical flow 
technique takes into account only the local movement of an object. Global correspondence can be detected by utilizing 
the critical point filter according to the base technology. 

[1. 3. 3] Determining the mapping with multiresolution

[0051] A mapping $f_{min}$ which gives the minimum energy and satisfies the BC is searched by using the multiresolution hierarchy. The mapping between the source subimage and the destination subimage at each level of the resolution is computed. Starting from the top of the resolution hierarchy (i.e., the coarsest level), the mapping is determined at each resolution level, while mappings at other levels are taken into account. The number of candidate mappings at each level is restricted by using the mappings at an upper (i.e., coarser) level of the hierarchy. More specifically, in the course of determining a mapping at a certain level, the mapping obtained at the level coarser by one is imposed as a sort of constraint condition.

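[0052] Now, when the following equation (15) holds,

$$(i',j') = \left( \left\lfloor \frac{i}{2} \right\rfloor, \left\lfloor \frac{j}{2} \right\rfloor \right) \qquad (15)$$

$p^{(m-1,s)}_{(i',j')}$ and $q^{(m-1,s)}_{(i',j')}$ are respectively called the parents of $p^{(m,s)}_{(i,j)}$ and $q^{(m,s)}_{(i,j)}$, where $\lfloor x \rfloor$ denotes the largest integer not exceeding $x$. Conversely, $p^{(m,s)}_{(i,j)}$ and $q^{(m,s)}_{(i,j)}$ are the child of $p^{(m-1,s)}_{(i',j')}$ and the child of $q^{(m-1,s)}_{(i',j')}$, respectively. A function parent$(i,j)$ is defined by the following (16).

$$parent(i,j) = \left( \left\lfloor \frac{i}{2} \right\rfloor, \left\lfloor \frac{j}{2} \right\rfloor \right) \qquad (16)$$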

[0053] A mapping between $p^{(m,s)}_{(i,j)}$ and $q^{(m,s)}_{(k,l)}$ is determined by computing the energy and finding the minimum thereof. The value of $f^{(m,s)}(i,j)=(k,l)$ is determined as follows using $f^{(m-1,s)}$ $(m=1,2,\ldots,n)$. First of all, imposed is a condition that $q^{(m,s)}_{(k,l)}$ should lie inside a quadrilateral defined by the following (17) and (18). Then, the applicable mappings are narrowed down by selecting ones that are thought to be reasonable or natural among them satisfying the BC.

$$q^{(m,s)}_{g^{(m,s)}(i-1,j-1)}\; q^{(m,s)}_{g^{(m,s)}(i-1,j+1)}\; q^{(m,s)}_{g^{(m,s)}(i+1,j+1)}\; q^{(m,s)}_{g^{(m,s)}(i+1,j-1)} \qquad (17)$$

where

$$g^{(m,s)}(i,j) = f^{(m-1,s)}(parent(i,j)) + f^{(m-1,s)}(parent(i,j)+(1,1)) \qquad (18)$$

[0054] The quadrilateral defined above is hereinafter referred to as the inherited quadrilateral of $p^{(m,s)}_{(i,j)}$. The pixel minimizing the energy is sought and obtained inside the inherited quadrilateral.

[0055] Fig. 3 illustrates the above-described procedures. The pixels A, B, C and D of the source image are mapped to A', B', C' and D' of the destination image, respectively, at the (m-1)th level in the hierarchy. The pixel $p^{(m,s)}_{(i,j)}$ should be mapped to the pixel $q^{(m,s)}_{f^{(m)}(i,j)}$, which exists inside the inherited quadrilateral A'B'C'D'. Thereby, bridging from the mapping at the (m-1)th level to the mapping at the m-th level is achieved.
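
The parent function (16) and the corner points (17) and (18) of the inherited quadrilateral can be sketched as follows. This is a minimal illustration rather than the implementation of the base technology: the coarser mapping is assumed to be stored as a nested list `f_coarse[i][j] = (k, l)`, and indices falling outside the coarser grid are clamped to its border, a boundary convention that the text does not specify.

```python
def parent(i, j):
    """Equation (16): coordinates of the parent pixel at the coarser level."""
    return (i // 2, j // 2)


def g(f_coarse, i, j):
    """Equation (18): f(parent(i, j)) + f(parent(i, j) + (1, 1)).

    Clamping to the border of the coarser grid is an assumption of this sketch.
    """
    size = len(f_coarse)
    i = max(i, 0)
    j = max(j, 0)
    pi, pj = parent(i, j)
    pi, pj = min(pi, size - 1), min(pj, size - 1)
    qi, qj = min(pi + 1, size - 1), min(pj + 1, size - 1)
    a, b = f_coarse[pi][pj], f_coarse[qi][qj]
    return (a[0] + b[0], a[1] + b[1])


def inherited_quadrilateral(f_coarse, i, j):
    """Equation (17): the four corners, in the order listed in the text."""
    return [
        g(f_coarse, i - 1, j - 1),
        g(f_coarse, i - 1, j + 1),
        g(f_coarse, i + 1, j + 1),
        g(f_coarse, i + 1, j - 1),
    ]
```

With an identity mapping on a 2 x 2 coarser level, the quadrilateral inherited by the finer pixel (1, 1) surrounds that pixel's neighborhood in the finer coordinates.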

[0056] The energy $E_0$ defined above is now replaced by the following (19) and (20)

$$E_{0(i,j)} = \left\| f^{(m,0)}(i,j) - g^{(m)}(i,j) \right\|^2 \qquad (19)$$

$$E_{0(i,j)} = \left\| f^{(m,s)}(i,j) - f^{(m,s-1)}(i,j) \right\|^2, \quad (1 \le s) \qquad (20)$$

for computing the submapping $f^{(m,0)}$ and the submapping $f^{(m,s)}$ at the m-th level, respectively.

[0057] In this manner, a mapping which keeps low the energy of all the submappings is obtained. Using the equation (20) makes the submappings corresponding to the different critical points associated to each other within the same level in order that the subimages can have high similarity. The equation (19) represents the distance between $f^{(m,s)}(i,j)$ and the location where $(i,j)$ should be mapped when regarded as a part of a pixel at the (m-1)th level.

[0058] When there is no pixel satisfying the BC inside the inherited quadrilateral A'B'C'D', the following steps are taken. First, pixels whose distance from the boundary of A'B'C'D' is L (at first, L=1) are examined. If a pixel whose energy is the minimum among them satisfies the BC, then this pixel will be selected as a value of $f^{(m,s)}(i,j)$. L is increased until such a pixel is found or L reaches its upper bound $L^{(m)}_{max}$. $L^{(m)}_{max}$ is fixed for each level m. If no such pixel is found at all, the third condition of the BC is ignored temporarily and such mappings that caused the area of the transformed quadrilateral to become zero (a point or a line) will be permitted so as to determine $f^{(m,s)}(i,j)$. If such a pixel is still not found, then the first and the second conditions of the BC will be removed.

[0059] Multiresolution approximation is essential to determining the global correspondence of the images while preventing the mapping from being affected by small details of the images. Without the multiresolution approximation, it is impossible to detect a correspondence between pixels whose distances are large. In the case where the multiresolution approximation is not available, the size of an image will be limited to a very small one, and only tiny changes in the images can be handled. Moreover, imposing smoothness on the mapping usually makes it difficult to find the correspondence of such pixels. That is because the energy of the mapping from one pixel to another pixel which is far therefrom is high. On the other hand, the multiresolution approximation enables finding the approximate correspondence of such pixels. This is because the distance between the pixels is small at the upper (coarser) level of the hierarchy of the resolution.

[1. 4] Automatic determination of the optimal parameter values

[0060] One of the main deficiencies of the existing image matching techniques lies in the difficulty of parameter adjustment. In most cases, the parameter adjustment is performed manually and it is extremely difficult to select the optimal value. However, according to the base technology, the optimal parameter values can be obtained completely automatically.

[0061] The system according to this base technology includes two parameters, namely, $\lambda$ and $\eta$, where $\lambda$ and $\eta$ represent the weight of the difference of the pixel intensity and the stiffness of the mapping, respectively. The initial value for these parameters is 0. First, $\lambda$ is gradually increased from $\lambda=0$ while $\eta$ is fixed to 0. As $\lambda$ becomes larger and the value of the combined evaluation equation (equation (14)) is minimized, the value of $C^{(m,s)}_f$ for each submapping generally becomes smaller. This basically means that the two images are matched better. However, if $\lambda$ exceeds the optimal value, the following phenomena (1 - 4) are caused.

1. Pixels which should not be corresponded are erroneously corresponded only because their intensities are close.

2. As a result, correspondence between images becomes inaccurate, and the mapping becomes invalid.

3. As a result, $D^{(m,s)}_f$ in the equation (14) tends to increase abruptly.

4. As a result, since the value of the equation (14) tends to increase abruptly, $f^{(m,s)}$ changes in order to suppress the abrupt increase of $D^{(m,s)}_f$. As a result, $C^{(m,s)}_f$ increases.

[0062] Therefore, a threshold value at which $C^{(m,s)}_f$ turns to an increase from a decrease is detected while a state in which the equation (14) takes the minimum value with $\lambda$ being increased is kept. Such $\lambda$ is determined as the optimal value at $\eta=0$. Then, the behavior of $C^{(m,s)}_f$ is examined while $\eta$ is increased gradually, and $\eta$ will be automatically determined by a method described later. $\lambda$ will be determined corresponding to the automatically determined $\eta$.
[0063] The above-described method resembles the focusing mechanism of human visual systems. In the human visual systems, the images of the respective right eye and left eye are matched while moving one eye. When the objects are clearly recognized, the moving eye is fixed.
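
The detection of the point at which the intensity energy turns from a decrease to an increase can be sketched as below. This is only an illustration under stated assumptions: `c_of_lambda` is a hypothetical callback that recomputes the submapping for a given weight and returns the resulting intensity energy, the step size is arbitrary, and the upper limit of 0.1 is borrowed from the experiment mentioned later in the text.

```python
def find_optimal_lambda(c_of_lambda, step=0.01, lam_max=0.1):
    """Increase lambda from 0 and stop when the energy C starts to rise.

    c_of_lambda: hypothetical callable returning C for a given lambda.
    Returns the last lambda before C turned from decreasing to increasing.
    """
    best_lam = 0.0
    prev_c = c_of_lambda(0.0)
    k = 1
    while k * step <= lam_max + 1e-12:
        lam = k * step
        c = c_of_lambda(lam)
        if c > prev_c:          # C turned upward: lambda exceeded its optimum
            break
        best_lam, prev_c = lam, c
        k += 1
    return best_lam
```

For a synthetic energy curve with a single minimum, the function returns the sampled weight closest to that minimum.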

[1. 4. 1] Dynamic determination of $\lambda$

[0064] $\lambda$ is increased from 0 at a certain interval, and a subimage is evaluated each time the value of $\lambda$ changes. As shown in the equation (14), the total energy is defined by $\lambda C^{(m,s)}_f + D^{(m,s)}_f$. $D^{(m,s)}_{(i,j)}$ in the equation (9) represents the smoothness and theoretically becomes minimum when the mapping is the identity mapping. $E_0$ and $E_1$ increase as the mapping is further distorted. Since $E_1$ is an integer, 1 is the smallest step of $D^{(m,s)}_f$. Thus, changing the mapping cannot reduce the total energy unless the changed amount (reduction amount) of the current $\lambda C^{(m,s)}_{(i,j)}$ is equal to or greater than 1. Since $D^{(m,s)}_f$ increases by more than 1 when the mapping is changed, the total energy is not reduced unless $\lambda C^{(m,s)}_{(i,j)}$ is reduced by more than 1.

[0065] Under this condition, it is shown that $C^{(m,s)}_{(i,j)}$ decreases in normal cases as $\lambda$ increases. The histogram of $C^{(m,s)}_{(i,j)}$ is denoted as $h(l)$, where $h(l)$ is the number of pixels whose energy $C^{(m,s)}_{(i,j)}$ is $l^2$. In order that $\lambda l^2 \ge 1$, for example, the case of $l^2 = 1/\lambda$ is considered. When $\lambda$ varies from $\lambda_1$ to $\lambda_2$, a number of pixels (denoted A) expressed by the following (21)

$$A \simeq \int_{\lambda_1}^{\lambda_2} \frac{h(l)}{\lambda^{3/2}}\, d\lambda \qquad (21)$$

changes to a more stable state having the energy (22) which is

$$C^{(m,s)}_f - l^2 = C^{(m,s)}_f - \frac{1}{\lambda} \qquad (22)$$

[0066] Here, it is assumed that all the energy of these pixels is approximated to be zero. It means that the value of $C^{(m,s)}_{(i,j)}$ changes by (23).

$$\partial C^{(m,s)}_f = -\frac{A}{\lambda} \qquad (23)$$

As a result, the equation (24) holds.

$$\frac{\partial C^{(m,s)}_f}{\partial \lambda} = -\frac{h(l)}{\lambda^{5/2}} \qquad (24)$$

Since $h(l)>0$, $C^{(m,s)}_f$ decreases in normal cases. However, when $\lambda$ tends to exceed the optimal value, the above phenomenon that is characterized by the increase in $C^{(m,s)}_f$ occurs. The optimal value of $\lambda$ is determined by detecting this phenomenon.
[0067] When

$$h(l) = H l^k = \frac{H}{\lambda^{k/2}} \qquad (25)$$

is assumed where both $H$ $(H>0)$ and $k$ are constants, the equation (26) holds.

$$\frac{\partial C^{(m,s)}_f}{\partial \lambda} = -\frac{H}{\lambda^{5/2+k/2}} \qquad (26)$$

Then, if $k \ne -3$, the following (27) holds.

$$C^{(m,s)}_f = C + \frac{H}{(3/2+k/2)\,\lambda^{3/2+k/2}} \qquad (27)$$

The equation (27) is a general equation of $C^{(m,s)}_f$ (where C is a constant).

[0068] When detecting the optimal value of $\lambda$, the number of pixels violating the BC may be examined for safety. In the course of determining a mapping for each pixel, the probability of violating the BC is assumed to be $p_0$ here. In that case, since

$$\frac{\partial A}{\partial \lambda} = \frac{h(l)}{\lambda^{3/2}} \qquad (28)$$

holds, the number of pixels violating the BC increases at a rate of the equation (29).

$$B_0 = \frac{h(l)\, p_0}{\lambda^{3/2}} \qquad (29)$$

Thus,

$$\frac{B_0 \lambda^{3/2}}{p_0\, h(l)} = 1 \qquad (30)$$

is a constant. If it is assumed that $h(l) = H l^k$, the following (31), for example,

$$B_0 \lambda^{3/2+k/2} = p_0 H \qquad (31)$$

becomes a constant. However, when $\lambda$ exceeds the optimal value, the above value of (31) increases abruptly. By detecting this phenomenon, whether or not the value of $B_0 \lambda^{3/2+k/2}/2^m$ exceeds an abnormal value $B_{0thres}$ is inspected, so that the optimal value of $\lambda$ can be determined. Similarly, whether or not the value of $B_1 \lambda^{3/2+k/2}/2^m$ exceeds an abnormal value $B_{1thres}$ is inspected, so that the increasing rate $B_1$ of pixels violating the third condition of the BC is checked. The reason why the factor $2^m$ is introduced here will be described at a later stage. This system is not sensitive to the two threshold values $B_{0thres}$ and $B_{1thres}$. The two threshold values $B_{0thres}$ and $B_{1thres}$ can be used to detect the excessive distortion of the mapping which fails to be detected through the observation of the energy $C^{(m,s)}_f$.
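
The safety check on the rate of BC violations can be expressed compactly. The sketch below is an assumption-laden illustration: the function name, the argument names and the way the rate `b0` is measured are hypothetical, and only the monitored quantity itself follows the text.

```python
def bc_violation_is_abnormal(b0, lam, m, threshold, k=1.0):
    """Return True when B0 * lambda**(3/2 + k/2) / 2**m exceeds the threshold.

    b0:        measured increase rate of pixels violating the BC (assumed input)
    lam:       current value of lambda
    m:         resolution level, so that 2**m compensates for the image size
    threshold: the abnormal value B0thres, chosen by the user
    k:         exponent of the histogram model h(l) = H * l**k
    """
    monitored = b0 * lam ** (1.5 + k / 2.0) / (2 ** m)
    return monitored > threshold
```

With k = 1 the monitored quantity reduces to the product of the rate and the square of lambda, divided by 2 to the power m.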
[0069] In the experimentation, the computation of $f^{(m,s)}$ is stopped and then the computation of $f^{(m,s+1)}$ is started when $\lambda$ exceeded 0.1. That is because the computation of submappings is affected by the difference of mere 3 out of 255 levels in the pixel intensity when $\lambda>0.1$, and it is difficult to obtain a correct result when $\lambda>0.1$.

[1. 4. 2] Histogram h(l)

[0070] The examination of $C^{(m,s)}_f$ does not depend on the histogram $h(l)$. The examination of the BC and its third condition may be affected by $h(l)$. $k$ is usually close to 1 when $(\lambda, C^{(m,s)}_f)$ is actually plotted. In the experiment, $k=1$ is used, that is, $B_0\lambda^2$ and $B_1\lambda^2$ are examined. If the true value of $k$ is less than 1, $B_0\lambda^2$ and $B_1\lambda^2$ do not become constants and increase gradually by the factor of $\lambda^{(1-k)/2}$. If $h(l)$ is a constant, the factor is, for example, $\lambda^{1/2}$. However, such a difference can be absorbed by setting the threshold $B_{0thres}$ appropriately.

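[0071] Let us model the source image by a circular object with its center at $(x_0,y_0)$ and its radius $r$, given by:

$$p(i,j) = \begin{cases} \dfrac{255}{r}\, c\!\left(\sqrt{(i-x_0)^2+(j-y_0)^2}\right) & \left(\sqrt{(i-x_0)^2+(j-y_0)^2} \le r\right) \\ 0 & (\text{otherwise}) \end{cases} \qquad (32)$$

and the destination image given by:

$$q(i,j) = \begin{cases} \dfrac{255}{r}\, c\!\left(\sqrt{(i-x_1)^2+(j-y_1)^2}\right) & \left(\sqrt{(i-x_1)^2+(j-y_1)^2} \le r\right) \\ 0 & (\text{otherwise}) \end{cases} \qquad (33)$$

with its center at $(x_1,y_1)$ and radius $r$. Let $c(x)$ have the form of $c(x)=x^k$. When the centers $(x_0,y_0)$ and $(x_1,y_1)$ are sufficiently far from each other, the histogram $h(l)$ is then in the form of:

$$h(l) \propto r\, l^k \quad (k \ne 0) \qquad (34)$$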



[0072] When $k=1$, the images represent objects with clear boundaries embedded in the backgrounds. These objects become darker toward their centers and brighter toward their boundaries. When $k=-1$, the images represent objects with vague boundaries. These objects are brightest at their centers, and become darker toward boundaries. Without much loss of generality, it suffices to state that objects in general are between these two types of objects. Thus, $k$ such that $-1 \le k \le 1$ can cover most cases, and it is guaranteed that the equation (27) is generally a decreasing function.
[0073] As can be observed from the above equation (34), attention must be directed to the fact that $r$ is influenced by the resolution of the image, namely, $r$ is proportional to $2^m$. That is why the factor $2^m$ was introduced in the above section [1.4.1].

[1. 4. 3] Dynamic determination of $\eta$

[0074] The parameter $\eta$ can also be automatically determined in the same manner. Initially, $\eta$ is set to zero, and the final mapping $f^{(n)}$ and the energy $C^{(n)}_f$ at the finest resolution are computed. Then, $\eta$ is increased by a certain value $\Delta\eta$, and the final mapping $f^{(n)}$ and the energy $C^{(n)}_f$ at the finest resolution are again computed. This process is repeated until the optimal value is obtained. $\eta$ represents the stiffness of the mapping because it is a weight of the following equation (35).

$$E^{(m,s)}_{0(i,j)} = \left\| f^{(m,s)}(i,j) - f^{(m,s-1)}(i,j) \right\|^2 \qquad (35)$$

[0075] When $\eta$ is zero, $D^{(n)}_f$ is determined irrespective of the previous submapping, and the present submapping would be elastically deformed and become too distorted. On the other hand, when $\eta$ is a very large value, $D^{(n)}_f$ is almost completely determined by the immediately previous submapping. The submappings are then very stiff, and the pixels are mapped to almost the same locations. The resulting mapping is therefore the identity mapping. When the value of $\eta$ increases from 0, $C^{(n)}_f$ gradually decreases as will be described later. However, when the value of $\eta$ exceeds the optimal value, the energy starts increasing as shown in Fig. 4. In Fig. 4, the x-axis represents $\eta$, and the y-axis represents $C_f$.
[0076] The optimum value of $\eta$ which minimizes $C^{(n)}_f$ can be obtained in this manner. However, since more elements affect the computation than in the case of $\lambda$, $C^{(n)}_f$ changes while slightly fluctuating. This difference is caused because a submapping is re-computed once in the case of $\lambda$ whenever an input changes slightly, whereas all the submappings must be re-computed in the case of $\eta$. Thus, whether the obtained value of $C^{(n)}_f$ is the minimum or not cannot be judged instantly. When candidates for the minimum value are found, the true minimum needs to be searched by setting up a further finer interval.
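
Because the energy fluctuates as the stiffness parameter grows, the search cannot stop at the first upturn; a coarse sweep followed by a finer sweep around the best candidate is one way to follow the advice above. The sketch below assumes a hypothetical callback `c_of_eta` that recomputes all submappings and returns the energy at the finest level; the step sizes and the refinement factor are illustrative choices, not values from the text.

```python
def search_eta(c_of_eta, eta_max, coarse_step, refine_factor=10):
    """Two-stage search for the eta that minimizes the energy C.

    Stage 1 samples [0, eta_max] at coarse_step and keeps the best sample.
    Stage 2 resamples one coarse step on each side of that sample with a
    step that is refine_factor times finer.
    """
    n = int(round(eta_max / coarse_step))
    coarse = [k * coarse_step for k in range(n + 1)]
    best = min(coarse, key=c_of_eta)

    lo = max(0.0, best - coarse_step)
    hi = min(eta_max, best + coarse_step)
    fine_step = coarse_step / refine_factor
    steps = int(round((hi - lo) / fine_step))
    fine = [lo + k * fine_step for k in range(steps + 1)]
    return min(fine, key=c_of_eta)
```

In practice each call to the callback is expensive, so the sampled values would normally be cached rather than recomputed.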

[1. 5] Supersampling

[0077] When deciding the correspondence between the pixels, the range of $f^{(m,s)}$ can be expanded to $R \times R$ ($R$ being the set of real numbers) in order to increase the degree of freedom. In this case, the intensity of the pixels of the destination image is interpolated, so that $f^{(m,s)}$ having the intensity at non-integer points

$$V\!\left(q^{(m,s)}_{f^{(m,s)}(i,j)}\right) \qquad (36)$$

is provided. Namely, supersampling is performed. In the actual implementation, $f^{(m,s)}$ is allowed to take integer and half-integer values, and

$$V\!\left(q^{(m,s)}_{(i,j)+(0.5,0.5)}\right) \qquad (37)$$

is given by

$$\left( V\!\left(q^{(m,s)}_{(i,j)}\right) + V\!\left(q^{(m,s)}_{(i,j)+(1,1)}\right) \right) / 2 \qquad (38)$$
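
The half-integer rule of the equation (38) amounts to averaging two diagonal neighbors. A minimal sketch, assuming the destination subimage is a nested list indexed as `image[i][j]`:

```python
def intensity_at_half_offset(image, i, j):
    """Intensity at (i, j) + (0.5, 0.5), following equation (38).

    The value is the mean of the pixel at (i, j) and the pixel at
    (i + 1, j + 1); the caller must keep both indices inside the image.
    """
    return (image[i][j] + image[i + 1][j + 1]) / 2
```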
[1. 6] Normalization of the pixel intensity of each image

[0078] When the source and destination images contain quite different objects, the raw pixel intensity may not be used to compute the mapping because a large difference in the pixel intensity causes excessively large energy $C^{(m,s)}_f$ relating to the intensity, thus making it difficult to perform the correct evaluation.
[0079] For example, the matching between a human face and a cat's face is computed as shown in Figs. 20(a) and 20(b). The cat's face is covered with hair and is a mixture of very bright pixels and very dark pixels. In this case, in order to compute the submappings of the two faces, its subimages are normalized. Namely, the darkest pixel intensity is set to 0 while the brightest pixel intensity is set to 255, and other pixel intensity values are obtained using linear interpolation.

[1. 7] Implementation 

[0080] In the implementation, utilized is a heuristic method where the computation proceeds linearly as the source image is scanned. First, the value of $f^{(m,s)}$ is determined at the top leftmost pixel $(i,j)=(0,0)$. The value of each $f^{(m,s)}(i,j)$ is then determined while $i$ is increased by one at each step. When $i$ reaches the width of the image, $j$ is increased by one and $i$ is reset to zero. Thereafter, $f^{(m,s)}(i,j)$ is determined while scanning the source image. Once pixel correspondence is determined for all the points, it means that a single mapping $f^{(m,s)}$ is determined.

[0081] When a corresponding point $q_{f(i,j)}$ is determined for $p_{(i,j)}$, a corresponding point $q_{f(i,j+1)}$ of $p_{(i,j+1)}$ is determined next. The position of $q_{f(i,j+1)}$ is constrained by the position of $q_{f(i,j)}$ since the position of $q_{f(i,j+1)}$ satisfies the BC. Thus, in this system, a point whose corresponding point is determined earlier is given higher priority. If the situation continues in which (0,0) is always given the highest priority, the final mapping might be unnecessarily biased. In order to avoid this bias, $f^{(m,s)}$ is determined in the following manner in the base technology.

[0082] First, when (s mod 4) is 0, $f^{(m,s)}$ is determined starting from (0,0) while gradually increasing both $i$ and $j$. When (s mod 4) is 1, it is determined starting from the top rightmost location while decreasing $i$ and increasing $j$. When (s mod 4) is 2, it is determined starting from the bottom rightmost location while decreasing both $i$ and $j$. When (s mod 4) is 3, it is determined starting from the bottom leftmost location while increasing $i$ and decreasing $j$. Since a concept such as the submapping, that is, a parameter $s$, does not exist in the finest n-th level, $f^{(n)}$ is computed continuously in two directions on the assumption that $s=0$ and $s=2$.
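
The four scan directions can be sketched as a generator of pixel coordinates. This is an illustration only; it assumes that $i$ runs along the width and $j$ along the height, as in the scanning description above, and the function name is hypothetical.

```python
def scan_order(width, height, s):
    """Yield (i, j) in the order used for submapping s.

    s mod 4 == 0: start top-left,     i increasing, j increasing
    s mod 4 == 1: start top-right,    i decreasing, j increasing
    s mod 4 == 2: start bottom-right, i decreasing, j decreasing
    s mod 4 == 3: start bottom-left,  i increasing, j decreasing
    """
    k = s % 4
    i_range = range(width) if k in (0, 3) else range(width - 1, -1, -1)
    j_range = range(height) if k in (0, 1) else range(height - 1, -1, -1)
    for j in j_range:
        for i in i_range:
            yield (i, j)
```

For the finest level, where no parameter $s$ exists, the same generator could be called with s=0 and then s=2.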

[0083] In the actual implementation, the values of $f^{(m,s)}(i,j)$ $(m=0,\ldots,n)$ that satisfy the BC are chosen as much as possible from the candidates $(k,l)$ by awarding a penalty to the candidates violating the BC. The energy $D_{(k,l)}$ of a candidate that violates the third condition of the BC is multiplied by $\phi$ and that of a candidate that violates the first or second condition of the BC is multiplied by $\psi$. In the actual implementation, $\phi=2$ and $\psi=100000$ are used.
[0084] In order to check the above-mentioned BC, the following test is performed as the actual procedure when determining $(k,l)=f^{(m,s)}(i,j)$. Namely, for each grid point $(k,l)$ in the inherited quadrilateral of $f^{(m,s)}(i,j)$, whether or not the z-component of the outer product of

$$W = \vec{A} \times \vec{B} \qquad (39)$$

is equal to or greater than 0 is examined, where

$$\vec{A} = \overrightarrow{q^{(m,s)}_{f^{(m,s)}(i,j-1)}\; q^{(m,s)}_{f^{(m,s)}(i+1,j-1)}} \qquad (40)$$

$$\vec{B} = \overrightarrow{q^{(m,s)}_{f^{(m,s)}(i,j-1)}\; q^{(m,s)}_{(k,l)}} \qquad (41)$$

Here, the vectors are regarded as 3D vectors and the z-axis is defined in the orthogonal right-hand coordinate system. When W is negative, the candidate is awarded a penalty by multiplying $D^{(m,s)}_{(k,l)}$ by $\psi$ so that it is selected as rarely as possible.

[0085] Figs. 5(a) and 5(b) illustrate the reason why this condition is inspected. Fig. 5(a) shows a candidate without a penalty and Fig. 5(b) shows one with a penalty. When determining the mapping $f^{(m,s)}(i,j+1)$ for the adjacent pixel at $(i,j+1)$, there is no pixel on the source image plane that satisfies the BC if the z-component of W is negative because then $q^{(m,s)}_{(k,l)}$ passes the boundary of the adjacent quadrilateral.
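
The orientation test can be sketched with a two-dimensional cross product. The sketch assumes that the first vector runs from the image of $(i,j-1)$ to the image of $(i+1,j-1)$ and the second from the image of $(i,j-1)$ to the candidate $(k,l)$; the function names and the generic `penalty` argument are hypothetical.

```python
def cross_z(a, b):
    """z-component of the outer product of two planar vectors."""
    return a[0] * b[1] - a[1] * b[0]


def orientation_w(q_start, q_next, candidate):
    """W for a candidate (k, l).

    q_start:   image of (i, j-1) under the mapping
    q_next:    image of (i+1, j-1) under the mapping
    candidate: candidate destination point (k, l)
    """
    a = (q_next[0] - q_start[0], q_next[1] - q_start[1])
    b = (candidate[0] - q_start[0], candidate[1] - q_start[1])
    return cross_z(a, b)


def penalized_energy(d, w, penalty):
    """Multiply the energy by the penalty factor when W is negative."""
    return d * penalty if w < 0 else d
```

A candidate lying on the expected side of the already mapped edge yields a non-negative W and keeps its energy unchanged.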


[1. 7. 1] The order of submappings 

[0086] In the actual implementation, $\sigma(0)=0$, $\sigma(1)=1$, $\sigma(2)=2$, $\sigma(3)=3$, $\sigma(4)=0$ were used when the resolution level was even, while $\sigma(0)=3$, $\sigma(1)=2$, $\sigma(2)=1$, $\sigma(3)=0$, $\sigma(4)=3$ were used when the resolution level was odd. Thus, the submappings are shuffled in an approximate manner. It is to be noted that the submapping is primarily of four types, and $s$ may be any one among 0 to 3. However, a processing with $s=4$ was actually performed for the reason described later.

[1. 8] Interpolations 

[0087] After the mapping between the source and destination images is determined, the intensity values of the corresponding pixels are interpolated. In the implementation, trilinear interpolation is used. Suppose that a square $p_{(i,j)}p_{(i+1,j)}p_{(i+1,j+1)}p_{(i,j+1)}$ on the source image plane is mapped to a quadrilateral $q_{f(i,j)}q_{f(i+1,j)}q_{f(i+1,j+1)}q_{f(i,j+1)}$ on the destination image plane. For simplicity, the distance between the image planes is assumed to be 1. The intermediate image pixels $r(x,y,t)$ $(0 \le x \le N-1,\ 0 \le y \le M-1)$ whose distance from the source image plane is $t$ $(0 \le t \le 1)$ are obtained as follows. First, the location of the pixel $r(x,y,t)$, where $x,y,t \in R$, is determined by the equation (42).

$$\begin{aligned} (x,y) = {} & (1-dx)(1-dy)(1-t)(i,j) + (1-dx)(1-dy)\,t\,f(i,j) \\ & + dx(1-dy)(1-t)(i+1,j) + dx(1-dy)\,t\,f(i+1,j) \\ & + (1-dx)dy(1-t)(i,j+1) + (1-dx)dy\,t\,f(i,j+1) \\ & + dx\,dy(1-t)(i+1,j+1) + dx\,dy\,t\,f(i+1,j+1) \end{aligned} \qquad (42)$$

The value of the pixel intensity at $r(x,y,t)$ is then determined by the equation (43).

$$\begin{aligned} V(r(x,y,t)) = {} & (1-dx)(1-dy)(1-t)V(p_{(i,j)}) + (1-dx)(1-dy)\,t\,V(q_{f(i,j)}) \\ & + dx(1-dy)(1-t)V(p_{(i+1,j)}) + dx(1-dy)\,t\,V(q_{f(i+1,j)}) \\ & + (1-dx)dy(1-t)V(p_{(i,j+1)}) + (1-dx)dy\,t\,V(q_{f(i,j+1)}) \\ & + dx\,dy(1-t)V(p_{(i+1,j+1)}) + dx\,dy\,t\,V(q_{f(i+1,j+1)}) \end{aligned} \qquad (43)$$

where $dx$ and $dy$ are parameters varying from 0 to 1.
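
The two interpolation equations share the same eight weights, so they can be evaluated together. The sketch below is a minimal illustration under assumed data structures: the mapping is a dictionary `f[(i, j)] = (k, l)`, and the images are nested lists indexed as `src[i][j]` and `dst[k][l]`.

```python
def interpolate_pixel(i, j, dx, dy, t, f, src, dst):
    """Location (42) and intensity (43) of an intermediate pixel.

    (i, j):  lower corner of the source square
    dx, dy:  position inside the square, each in [0, 1]
    t:       distance from the source image plane, in [0, 1]
    """
    corners = [
        ((i, j), (1 - dx) * (1 - dy)),
        ((i + 1, j), dx * (1 - dy)),
        ((i, j + 1), (1 - dx) * dy),
        ((i + 1, j + 1), dx * dy),
    ]
    x = y = value = 0.0
    for (ci, cj), w in corners:
        k, l = f[(ci, cj)]
        # Blend each source corner with its corresponding destination point.
        x += w * ((1 - t) * ci + t * k)
        y += w * ((1 - t) * cj + t * l)
        value += w * ((1 - t) * src[ci][cj] + t * dst[k][l])
    return (x, y), value
```

At t=0 the result reproduces the bilinearly interpolated source image, and at t=1 it reproduces the destination values carried back through the mapping.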
[1.9] Mapping on which constraints are imposed 

[0088] So far, the determination of the mapping on which no constraint is imposed has been described. However, when a correspondence between particular pixels of the source and destination images is provided in a predetermined manner, the mapping can be determined using such correspondence as a constraint.

[0089] The basic idea is that the source image is roughly deformed by an approximate mapping which maps the specified pixels of the source image to the specified pixels of the destination image, and thereafter a mapping f is accurately computed.

[0090] First, the specified pixels of the source image are mapped to the specified pixels of the destination image, then the approximate mapping that maps other pixels of the source image to appropriate locations is determined. In other words, the mapping is such that pixels in the vicinity of the specified pixels are mapped to the locations near the position to which the specified one is mapped. Here, the approximate mapping at the m-th level in the resolution hierarchy is denoted by $F^{(m)}$.

[0091] The approximate mapping F is determined in the following manner. First, the mapping for several pixels is specified. When $n_s$ pixels

$$p(i_0,j_0),\ p(i_1,j_1),\ \ldots,\ p(i_{n_s-1},j_{n_s-1}) \qquad (44)$$

of the source image are specified, the following values in the equation (45) are determined.

$$F^{(n)}(i_0,j_0) = (k_0,l_0),\quad F^{(n)}(i_1,j_1) = (k_1,l_1),\ \ldots,\quad F^{(n)}(i_{n_s-1},j_{n_s-1}) = (k_{n_s-1},l_{n_s-1}) \qquad (45)$$

[0092] For the remaining pixels of the source image, the amount of displacement is the weighted average of the displacement of $p(i_h,j_h)$ $(h=0,\ldots,n_s-1)$. Namely, a pixel $p_{(i,j)}$ is mapped to the following pixel (expressed by the equation (46)) of the destination image.

$$F^{(m)}(i,j) = \frac{(i,j) + \sum_{h=0}^{n_s-1} (k_h - i_h,\ l_h - j_h)\, weight_h(i,j)}{2^{n-m}} \qquad (46)$$

where

$$weight_h(i,j) = \frac{1 / \|(i_h - i,\ j_h - j)\|^2}{total\_weight(i,j)} \qquad (47)$$

where

$$total\_weight(i,j) = \sum_{h=0}^{n_s-1} 1 / \|(i_h - i,\ j_h - j)\|^2 \qquad (48)$$
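
The inverse-square-distance weighting of the equations (46) to (48) can be sketched as follows. The sketch assumes the specified correspondences are given as a non-empty list of pairs `((i_h, j_h), (k_h, l_h))`; returning the specified target directly when the queried pixel coincides with a specified pixel is an assumption made here to avoid a division by zero, which the text does not discuss.

```python
def approximate_mapping(i, j, specified, n, m):
    """F^(m)(i, j) as the weighted average of the specified displacements.

    specified: list of ((i_h, j_h), (k_h, l_h)) pairs given at the finest level n
    n, m:      finest level and current level of the resolution hierarchy
    """
    scale = 2 ** (n - m)
    inverse_sq = []
    for (ih, jh), (kh, lh) in specified:
        d2 = (ih - i) ** 2 + (jh - j) ** 2
        if d2 == 0:
            # The pixel itself is specified (assumed special case).
            return (kh / scale, lh / scale)
        inverse_sq.append(1.0 / d2)
    total_weight = sum(inverse_sq)            # equation (48)
    x, y = float(i), float(j)
    for w, ((ih, jh), (kh, lh)) in zip(inverse_sq, specified):
        weight = w / total_weight             # equation (47)
        x += (kh - ih) * weight
        y += (lh - jh) * weight
    return (x / scale, y / scale)             # equation (46)
```

A pixel halfway between two specified pixels receives the mean of their two displacements.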



[0093] Second, the energy $D^{(m,s)}_{(i,j)}$ of the candidate mapping f is changed so that a mapping f similar to $F^{(m)}$ has a lower energy. Precisely speaking, $D^{(m,s)}_{(i,j)}$ is expressed by the equation (49).

$$D^{(m,s)}_{(i,j)} = \eta E^{(m,s)}_{0(i,j)} + E^{(m,s)}_{1(i,j)} + \kappa E^{(m,s)}_{2(i,j)} \qquad (49)$$

$$E^{(m,s)}_{2(i,j)} = \begin{cases} 0 & \text{if } \left\| F^{(m)}(i,j) - f^{(m,s)}(i,j) \right\|^2 \le \left\lfloor \dfrac{\rho^2}{2^{2(n-m)}} \right\rfloor \\ \left\| F^{(m)}(i,j) - f^{(m,s)}(i,j) \right\|^2 & \text{otherwise} \end{cases} \qquad (50)$$

where $\kappa, \rho \ge 0$. Finally, the mapping f is completely determined by the above-described automatic computing process of mappings.

[0094] Note that $E^{(m,s)}_{2(i,j)}$ becomes 0 if $f^{(m,s)}(i,j)$ is sufficiently close to $F^{(m)}(i,j)$, i.e., the distance therebetween is equal to or less than

$$\left\lfloor \frac{\rho^2}{2^{2(n-m)}} \right\rfloor \qquad (51)$$

It is defined so because it is desirable to determine each value $f^{(m,s)}(i,j)$ automatically to fit in an appropriate place in the destination image as long as each value $f^{(m,s)}(i,j)$ is close to $F^{(m)}(i,j)$. For this reason, there is no need to specify the precise correspondence in detail, and the source image is automatically mapped so that the source image matches the destination image.

[2] Concrete Processing Procedure

[0095] The flow of the process utilizing the respective elemental techniques described in [1] will be described.
[0096] Fig. 6 is a flowchart of the entire procedure of the base technology. Referring to Fig. 6, a processing using a multiresolutional critical point filter is first performed (S1). A source image and a destination image are then matched (S2). S2 is not indispensable, and other processings such as image recognition may be performed instead, based on the characteristics of the image obtained at S1.

[0097] Fig. 7 is a flowchart showing the details of the process at S1 shown in Fig. 6. This process is performed on the assumption that a source image and a destination image are matched at S2. Thus, a source image is first hierarchized using a critical point filter (S10) so as to obtain a series of source hierarchical images. Then, a destination image is hierarchized in the similar manner (S11) so as to obtain a series of destination hierarchical images. The order of S10 and S11 in the flow is arbitrary, and the source hierarchical images and the destination hierarchical images can be generated in parallel.
[0098] Fig. 8 is a flowchart showing the details of the process at S10 shown in Fig. 7. Suppose that the size of the original source image is $2^n \times 2^n$. Since source hierarchical images are sequentially generated from one with a finer resolution to one with a coarser resolution, the parameter m which indicates the level of resolution to be processed is set to n (S100). Then, critical points are detected from the images $p^{(m,0)}$, $p^{(m,1)}$, $p^{(m,2)}$ and $p^{(m,3)}$ of the m-th level of resolution, using a critical point filter (S101), so that the images $p^{(m-1,0)}$, $p^{(m-1,1)}$, $p^{(m-1,2)}$ and $p^{(m-1,3)}$ of the (m-1)th level are generated (S102). Since m=n here, $p^{(m,0)} = p^{(m,1)} = p^{(m,2)} = p^{(m,3)} = p^{(n)}$ holds and four types of subimages are thus generated from a single source image.

[0099] Fig. 9 shows correspondence between partial images of the m-th and those of (m-1)th levels of resolution. Referring to Fig. 9, respective values represent the intensity of respective pixels. $p^{(m,s)}$ symbolizes four images $p^{(m,0)}$ through $p^{(m,3)}$, and when generating $p^{(m-1,0)}$, $p^{(m,s)}$ is regarded as $p^{(m,0)}$. For example, as for the block shown in Fig. 9, comprising four pixels with their pixel intensity values indicated inside, images $p^{(m-1,0)}$, $p^{(m-1,1)}$, $p^{(m-1,2)}$ and $p^{(m-1,3)}$ acquire "3", "8", "6" and "10", respectively, according to the rules described in [1.2]. This block at the m-th level is replaced at the (m-1)th level by respective single pixels acquired thus. Therefore, the size of the subimages at the (m-1)th level is $2^{m-1} \times 2^{m-1}$.
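
One step of the hierarchy generation (S101 and S102) can be sketched as below. The rules of [1.2] are not reproduced in this part of the text, so the sketch assumes that the four subimages keep, per 2 x 2 block, the minimum, the maximum of the pairwise minima, the minimum of the pairwise maxima, and the maximum; the pairing of pixels inside a block and the sample block are hypothetical.

```python
def coarser_subimages(p0, p1, p2, p3):
    """Build the four (m-1)-level subimages from the four m-level subimages.

    Each input is a square nested list whose side is a power of two.
    The combination rules are an assumption standing in for section [1.2].
    """
    size = len(p0) // 2

    def reduce_blocks(img, outer, inner):
        return [
            [
                outer(
                    inner(img[2 * i][2 * j], img[2 * i][2 * j + 1]),
                    inner(img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]),
                )
                for j in range(size)
            ]
            for i in range(size)
        ]

    return (
        reduce_blocks(p0, min, min),   # minima
        reduce_blocks(p1, max, min),   # one type of saddle point
        reduce_blocks(p2, min, max),   # the other type of saddle point
        reduce_blocks(p3, max, max),   # maxima
    )
```

At the finest level the same source image is passed four times, matching the statement that all four subimages coincide when m=n. Repeating the call until the side length is 1 yields the four series of hierarchical images.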

[0100] After m is decremented (S103 in Fig. 8), it is ensured that m is not negative (S104). Thereafter, the process 
returns to S101 , so that subimages of the next level of resolution, i.e., a next coarser level, are generated. The above 
process is repeated until subimages at m=0 (0-th level) are generated to complete the process at S10. The size of the 
subimages at the 0-th level is 1 X 1. 

[0101] Fig. 10 shows source hierarchical images generated at S10 in the case of n=3. The initial source image is 
the only image common to the four series followed. The four types of subimages are generated independently, de- 
pending on the type of a critical point. Note that the process in Fig. 8 is common to S11 shown in Fig. 7, and that 
destination hierarchical images are generated through the similar procedure. Then, the process by S1 shown in Fig.
6 is completed.
[0102] In the base technology, in order to proceed to S2 shown in Fig. 6, a matching evaluation is prepared. Fig. 11 shows the preparation procedure. Referring to Fig. 11, a plurality of evaluation equations are set (S30). Such evaluation equations include the energy $C^{(m,s)}_f$ concerning a pixel value, introduced in [1.3.2.1], and the energy $D^{(m,s)}_f$ concerning the smoothness of the mapping introduced in [1.3.2.2]. Next, by combining these evaluation equations, a combined evaluation equation is set (S31). Such a combined evaluation equation includes $\lambda C^{(m,s)}_f + D^{(m,s)}_f$. Using $\eta$ introduced in [1.3.2.2], we have

$$\sum \sum \left( \lambda C^{(m,s)}_{(i,j)} + \eta E^{(m,s)}_{0(i,j)} + E^{(m,s)}_{1(i,j)} \right) \qquad (52)$$

In the equation (52) the sum is taken for each i and j where i and j run through $0, 1, \ldots, 2^m-1$. Now, the preparation for matching evaluation is completed.
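
The combined evaluation equation (52) can be evaluated for a given candidate mapping as sketched below. Several points are assumptions of this sketch: the intensity energy is taken as the squared intensity difference between corresponding pixels (its definition in [1.3.2.1] lies outside this part of the text), the mapping is a nested list `f[i][j] = (k, l)`, and neighbors with a negative index are treated as having zero displacement.

```python
def combined_energy(src, dst, f, lam, eta):
    """Sum of lam * C + eta * E0 + E1 over all pixels of a square subimage."""
    n = len(src)

    def displacement(i, j):
        if i < 0 or j < 0:
            return (0, 0)          # boundary convention assumed by this sketch
        k, l = f[i][j]
        return (k - i, l - j)

    total = 0.0
    for i in range(n):
        for j in range(n):
            k, l = f[i][j]
            c = (src[i][j] - dst[k][l]) ** 2          # intensity term (assumed form)
            e0 = (i - k) ** 2 + (j - l) ** 2          # distance term, equation (10)
            di, dj = displacement(i, j)
            e1 = 0.0
            for ii in (i - 1, i):                     # smoothness term, equation (11)
                for jj in (j - 1, j):
                    ni, nj = displacement(ii, jj)
                    e1 += ((di - ni) ** 2 + (dj - nj) ** 2) / 4
            total += lam * c + eta * e0 + e1
    return total
```

For an identity mapping between identical images every term vanishes, which is consistent with the identity mapping being the starting point of the evaluation.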

[0103] Fig. 12 is a flowchart showing the details of the process of S2 shown in Fig. 6. As described in [1], the source hierarchical images and destination hierarchical images are matched between images having the same level of resolution. In order to detect global correspondence correctly, a matching is calculated in sequence from a coarse level to a fine level of resolution. Since the source and destination hierarchical images are generated by use of the critical point filter, the location and intensity of critical points are clearly stored even at a coarse level. Thus, the result of the global matching is far superior to the conventional method.

[0104] Referring to Fig. 12, a coefficient parameter $\eta$ and a level parameter m are set to 0 (S20). Then, a matching is computed between respective four subimages at the m-th level of the source hierarchical images and those of the destination hierarchical images at the m-th level, so that four types of submappings $f^{(m,s)}$ (s=0, 1, 2, 3) which satisfy the BC and minimize the energy are obtained (S21). The BC is checked by using the inherited quadrilateral described in [1.3.3]. In that case, the submappings at the m-th level are constrained by those at the (m-1)th level, as indicated by the equations (17) and (18). Thus, the matching computed at a coarser level of resolution is used in subsequent calculation of a matching. This is a vertical reference between different levels. If m=0, there is no coarser level, and this exceptional process will be described using Fig. 13.

[0105] On the other hand, a horizontal reference within the same level is also performed. As indicated by the equation
(20) in [1.3.3], $f^{(m,3)}$, $f^{(m,2)}$ and $f^{(m,1)}$ are respectively determined so as to be analogous to
$f^{(m,2)}$, $f^{(m,1)}$ and $f^{(m,0)}$. This is because a situation in which the submappings are totally different
seems unnatural, even though the type of critical points differs, so long as the critical points are originally included
in the same source and destination images. As can be seen from the equation (20), the closer the submappings are to
each other, the smaller the energy becomes, so that the matching is then considered more satisfactory.
[0106] As for $f^{(m,0)}$, which is to be initially determined, a coarser level by one is referred to since there is no
other submapping at the same level to be referred to, as shown in the equation (19). In the experiment, however, a
procedure is adopted such that after the submappings were obtained up to $f^{(m,3)}$, $f^{(m,0)}$ is renewed once
utilizing the thus obtained submappings as a constraint. This procedure is equivalent to a process in which s=4 is
substituted into the equation (20) and $f^{(m,4)}$ is set to $f^{(m,0)}$ anew. The above process is employed to avoid
the tendency in which the degree of association between $f^{(m,0)}$ and $f^{(m,3)}$ becomes too low. This scheme
actually produced a preferable result. In addition to this scheme, the submappings are shuffled in the experiment as
described in [1.7.1], so as to closely maintain the degrees of association among submappings which are originally
determined independently for each type of critical point. Furthermore, in order to prevent the tendency of being
dependent on the starting point in the process, the location thereof is changed according to the value of s as
described in [1.7].

[0107] Fig. 13 illustrates how the submapping is determined at the 0-th level. Since at the 0-th level each sub-image
is constituted by a single pixel, the four submappings are automatically chosen as the identity mapping. Fig. 14
shows how the submappings are determined at the first level. At the first level, each of the sub-images is constituted
of four pixels, which are indicated by a solid line. When a corresponding point (pixel) of the point (pixel) x in
$p^{(1,s)}$ is searched within $q^{(1,s)}$, the following procedure is adopted.

1. An upper left point a, an upper right point b, a lower left point c and a lower right point d with respect to the point
x are obtained at the first level of resolution.

2. Pixels to which the points a to d belong at a coarser level by one, i.e., the 0-th level, are searched. In Fig. 14,
the points a to d belong to the pixels A to D, respectively. However, the pixels A to C are virtual pixels which do
not exist in reality.

3. The corresponding points A' to D' of the pixels A to D, which have already been defined at the 0-th level, are
plotted in $q^{(1,s)}$. The pixels A' to C' are virtual pixels and regarded to be located at the same positions as the
pixels A to C.

4. The corresponding point a' to the point a in the pixel A is regarded as being located inside the pixel A', and the
point a' is plotted. Then, it is assumed that the position occupied by the point a in the pixel A (in this case, positioned
at the upper right) is the same as the position occupied by the point a' in the pixel A'.

5. The corresponding points b' to d' are plotted by using the same method as the above 4 so as to produce an
inherited quadrilateral defined by the points a' to d'.

6. The corresponding point x' of the point x is searched such that the energy becomes minimum in the inherited
quadrilateral. Candidate corresponding points x' may be limited to the pixels, for instance, whose centers are
included in the inherited quadrilateral. In the case shown in Fig. 14, the four pixels all become candidates.

[0108] The above described is a procedure for determining the corresponding point of a given point x. The same
processing is performed on all other points so as to determine the submappings. As the inherited quadrilateral is
expected to become deformed at the upper levels (higher than the second level), the pixels A' to D' will be positioned
apart from one another as shown in Fig. 3.
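
The candidate restriction in step 6 can be illustrated with a minimal sketch. It assumes that pixel (i, j) covers the unit square whose center is (i + 0.5, j + 0.5), and that the quadrilateral vertices are supplied in perimeter order (a', b', d', c'); the function names `point_in_polygon` and `candidate_pixels` are hypothetical and do not come from the base technology, and the ray-casting test is a swapped-in generic technique rather than the method of the specification.

```python
import math


def point_in_polygon(pt, poly):
    """Ray-casting test: True if pt lies inside the polygon given as a vertex list."""
    x, y = pt
    inside = False
    n = len(poly)
    for k in range(n):
        x1, y1 = poly[k]
        x2, y2 = poly[(k + 1) % n]
        # Consider only edges that straddle the horizontal line through pt.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def candidate_pixels(quad):
    """List pixels (i, j) whose centers fall inside the inherited quadrilateral.

    quad: four (x, y) vertices in perimeter order, e.g. a', b', d', c'.
    Assumption: pixel (i, j) has its center at (i + 0.5, j + 0.5).
    """
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    candidates = []
    for j in range(math.floor(min(ys)), math.ceil(max(ys))):
        for i in range(math.floor(min(xs)), math.ceil(max(xs))):
            if point_in_polygon((i + 0.5, j + 0.5), quad):
                candidates.append((i, j))
    return candidates


# A 2 x 2 quadrilateral: all four pixels become candidates.
print(candidate_pixels([(0, 0), (2, 0), (2, 2), (0, 2)]))
```

The energy minimization itself is then carried out only over the returned candidates; how the energy is evaluated is outside the scope of this sketch.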
[0109] Once the four submappings at the m-th level are determined in this manner, m is incremented (S22 in Fig.
12). Then, when it is confirmed that m does not exceed n (S23), the process returns to S21. Thereafter, every time the
process returns to S21, submappings at a finer level of resolution are obtained until the process finally returns to S21,
at which time the mapping $f^{(n)}$ at the n-th level is determined. This mapping is denoted as $f^{(n)}(\eta=0)$
because it has been determined relative to $\eta=0$.

[0110] Next, to obtain the mapping with respect to other different $\eta$, $\eta$ is shifted by $\Delta\eta$ and m is
reset to zero (S24). After confirming that the new $\eta$ does not exceed a predetermined search-stop value
$\eta_{max}$ (S25), the process returns to S21 and the mapping $f^{(n)}(\eta=\Delta\eta)$ relative to the new $\eta$ is
obtained. This process is repeated while obtaining $f^{(n)}(\eta=i\Delta\eta)$ $(i=0,1,\ldots)$ at S21. When $\eta$
exceeds $\eta_{max}$, the process proceeds to S26 and the optimal $\eta=\eta_{opt}$ is determined using a method
described later, so as to let $f^{(n)}(\eta=\eta_{opt})$ be the final mapping $f^{(n)}$.

[0111] Fig. 15 is a flowchart showing the details of the process of S21 shown in Fig. 12. According to this flowchart,
the submappings at the m-th level are determined for a certain predetermined $\eta$. When determining the mappings, the
optimal $\lambda$ is defined independently for each submapping in the base technology.

[0112] Referring to Fig. 15, s and $\lambda$ are first reset to zero (S210). Then, the submapping $f^{(m,s)}$ that
minimizes the energy with respect to the then $\lambda$ (and, implicitly, $\eta$) is obtained (S211), and the thus
obtained submapping is denoted as $f^{(m,s)}(\lambda=0)$. In order to obtain the mapping with respect to other
different $\lambda$, $\lambda$ is shifted by $\Delta\lambda$. After confirming that the new $\lambda$ does not exceed a
predetermined search-stop value $\lambda_{max}$ (S213), the process returns to S211 and the mapping
$f^{(m,s)}(\lambda=\Delta\lambda)$ relative to the new $\lambda$ is obtained. This process is repeated while obtaining
$f^{(m,s)}(\lambda=i\Delta\lambda)$ $(i=0,1,\ldots)$. When $\lambda$ exceeds $\lambda_{max}$, the process proceeds to
S214 and the optimal $\lambda=\lambda_{opt}$ is determined, so as to let $f^{(m,s)}(\lambda=\lambda_{opt})$ be the final
mapping $f^{(m,s)}$ (S214).

[0113] Next, in order to obtain other submappings at the same level, $\lambda$ is reset to zero and s is incremented
(S215). After confirming that s does not exceed 4 (S216), the process returns to S211. When s=4, $f^{(m,0)}$ is renewed
utilizing $f^{(m,3)}$ as described above and a submapping at that level is determined.

[0114] Fig. 16 shows the behavior of the energy $C_f^{(m,s)}$ corresponding to $f^{(m,s)}(\lambda=i\Delta\lambda)$
$(i=0,1,\ldots)$ for a certain m and s while varying $\lambda$. As described in [1.4], as $\lambda$ increases,
$C_f^{(m,s)}$ normally decreases but changes to increase after $\lambda$ exceeds the optimal value. In this base
technology, the $\lambda$ at which $C_f^{(m,s)}$ takes its minimum is defined as $\lambda_{opt}$. As observed in Fig. 16,
even if $C_f^{(m,s)}$ turns to decrease again in the range $\lambda>\lambda_{opt}$, the mapping will be spoiled by then
and becomes meaningless. For this reason, it suffices to pay attention to the first occurring minimum value.
$\lambda_{opt}$ is independently determined for each submapping including $f^{(n)}$.
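
The "first occurring minimum" rule can be sketched as follows. This is only an illustration under stated assumptions: the energies are assumed to have already been sampled at $\lambda = i\Delta\lambda$ and stored in a list, and the function names `first_local_minimum` and `optimal_lambda` are hypothetical.

```python
def first_local_minimum(energies):
    """Return the index of the first local minimum in a sampled energy curve.

    Scans from i = 0 and stops at the first sample after which the energy
    increases; later, deeper dips are ignored on purpose.
    """
    for i in range(1, len(energies)):
        if energies[i] > energies[i - 1]:
            return i - 1
    # The curve never turned upward within the searched range.
    return len(energies) - 1


def optimal_lambda(energies, delta_lambda):
    """Convert the index of the first local minimum into a lambda value."""
    return first_local_minimum(energies) * delta_lambda


# The energy falls, rises at the fourth sample, then falls again.
samples = [5.0, 3.0, 2.0, 2.5, 1.0]
print(first_local_minimum(samples))   # index of the first minimum
print(optimal_lambda(samples, 0.5))   # corresponding lambda
```

The last sample in the usage example is lower than the selected minimum, yet it is not chosen, which mirrors the statement that a later decrease in the range beyond the optimal value is disregarded.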

[0115] Fig. 17 shows the behavior of the energy $C_f^{(n)}$ corresponding to $f^{(n)}(\eta=i\Delta\eta)$
$(i=0,1,\ldots)$ while varying $\eta$. Here too, $C_f^{(n)}$ normally decreases as $\eta$ increases, but $C_f^{(n)}$
changes to increase after $\eta$ exceeds the optimal value. Thus, the $\eta$ at which $C_f^{(n)}$ takes its minimum is
defined as $\eta_{opt}$. Fig. 17 can be considered as an enlarged graph around zero along the horizontal axis shown in
Fig. 4. Once $\eta_{opt}$ is determined, $f^{(n)}$ can be finally determined.
[0116] As described above, this base technology provides various merits. First, since there is no need to detect
edges, problems in connection with the conventional techniques of the edge detection type are solved. Furthermore,
prior knowledge about objects included in an image is not needed, so automatic detection of corresponding
points is achieved. Using the critical point filter, it is possible to preserve the intensity and locations of critical
points even at a coarse level of resolution, which is extremely advantageous when applied to object recognition,
characteristic extraction, and image matching. As a result, it is possible to construct an image processing system
which significantly reduces manual labor.

[0117] Some extensions to or modifications of the above-described base technology may be made as follows.
(1) Parameters are automatically determined when the matching is computed between the source and destination 
hierarchical images in the base technology. This method can be applied not only to the calculation of the matching 
between the hierarchical images but also to computing the matching between two images in general. 
[0118] For instance, an energy $E_0$ relative to a difference in the intensity of pixels and an energy $E_1$ relative to
a positional displacement of pixels between two images may be used as evaluation equations, and a linear sum of these
equations, i.e., $E_{tot} = \alpha E_0 + E_1$, may be used as a combined evaluation equation. While paying attention to
the neighborhood of the extrema in this combined evaluation equation, $\alpha$ is automatically determined. Namely,
mappings which minimize $E_{tot}$ are obtained for various $\alpha$'s. Among such mappings, the $\alpha$ at which
$E_{tot}$ takes the minimum value is defined as an optimal parameter. The mapping corresponding to this parameter is
finally regarded as the optimal mapping between the two images.

[0119] Many other methods are available in the course of setting up evaluation equations. For instance, a term which
becomes larger as the evaluation result becomes more favorable, such as $1/E_1$ and $1/E_2$, may be employed. A
combined evaluation equation is not necessarily a linear sum; an n-powered sum (n=2, 1/2, -1, -2, etc.), a polynomial
or an arbitrary function may be employed when appropriate.

[0120] The system may employ a single parameter such as the above $\alpha$, two parameters such as $\eta$ and $\lambda$
in the base technology, or more than two parameters. When there are more than three parameters used, they are
determined while changing one at a time.

(2) In the base technology, a parameter is determined in two steps: first, the mapping that minimizes the value of the
combined evaluation equation is determined, and then the point at which one of the evaluation equations constituting
the combined evaluation equation takes its minimum is detected. However, instead of this two-step processing,
a parameter may be effectively determined, as the case may be, in a manner such that the minimum value of a combined
evaluation equation becomes minimum. In that case, $\alpha E_0 + \beta E_1$, for instance, may be taken up as the
combined evaluation equation, where $\alpha + \beta = 1$ is imposed as a constraint so as to equally treat each
evaluation equation. The essence of automatic determination of a parameter boils down to determining the parameter
such that the energy becomes minimum.

(3) In the base technology, four types of submappings related to four types of critical points are generated at each level
of resolution. However, one, two, or three types among the four types may be selectively used. For instance, if there
exists only one bright point in an image, generation of hierarchical images based solely on $f^{(m,3)}$ related to a
maxima point can be effective to a certain degree. In this case, no other submapping is necessary at the same level,
so the amount of computation relating to s is effectively reduced.

(4) In the base technology, as the level of resolution of an image advances by one through a critical point filter, the
number of pixels becomes 1/4. However, it is possible to suppose that one block consists of 3×3 pixels and that critical
points are searched in this 3×3 block, in which case the number of pixels becomes 1/9 as the level advances by one.

(5) When the source and the destination images are color images, they are first converted to monochrome images,
and the mappings are then computed. The source color images are then transformed by using the mappings thus
obtained. Alternatively, the submappings may be computed for each RGB component.

Preferred Embodiments for Image Interpolation 

[0121] The image interpolation techniques based on the above-described base technology will be described here.
Firstly, efficient compression of a corresponding point file by introducing a mesh will be described, and
thereafter an image interpolation apparatus will be described with reference to Fig. 23.

[0122] Fig. 18 shows a first image 11 and a second image 12, where certain pixels $p_1(x_1, y_1)$ and $p_2(x_2, y_2)$
correspond therebetween. The correspondence of these is obtained in the base technology.

[0123] Referring to Fig. 19, when a mesh is provided on the first image 11, corresponding positions are shown on
the second image 12 of a polygon which constitutes the mesh. Now, a polygon R1 of interest on the first image 11 is
determined by four lattice points A, B, C and D. Let this polygon R1 be called a "source polygon." As has been shown
in Fig. 18, these lattice points A, B, C and D have respectively corresponding points A', B', C' and D' on the second
image 12, and a polygon R2 formed thus by the corresponding points is called a "destination polygon." The source
polygon is generally a rectangle while the destination polygon is generally a quadrilateral. In any event, according to
the present embodiment, the correspondence relation between the first and second images is not described pixel by
pixel; instead, the corresponding pixels are described with respect to the lattice points of the source polygon. Such a
description is written in a corresponding point file. By directing attention to the lattice points, the capacity required
for the corresponding point file can be reduced significantly.

[0124] The corresponding point file is utilized for generating an intermediate image of the first image 11 and the
second image 12. As described in the base technology section, intermediate images at an arbitrary temporal
position can be generated by interpolating positions between the corresponding points. Thus, storing the first image
11, the second image 12 and the corresponding point file enables generating the morphing between two images and
smooth motion pictures thereof, thus obtaining a compression effect on the motion pictures.

[0125] Fig. 20 shows a method by which to compute the correspondence relation regarding points other than the
lattice points, from the corresponding point file. Since the corresponding point file contains information on the
lattice points only, data corresponding to interior points of the polygon need to be computed separately. Fig. 20 shows
the correspondence between a triangle ABC which corresponds to a lower half of the source polygon R1 shown in Fig. 19
and a triangle A'B'C' which corresponds to that of the destination polygon R2 shown in Fig. 19. Now, suppose that the
intersection point of the line segment AC and the extension of the line BQ through an interior point Q of the triangle
ABC interior-divides the line segment AC in the ratio t:(1-t), and that the point Q interior-divides the line segment
connecting this AC interior-dividing point and the point B in the ratio s:(1-s). Similarly, the intersection point of
the line segment A'C' and the extension of the line B'Q' through the corresponding point Q', which corresponds to the
point Q, in the triangle A'B'C' on the destination polygon side interior-divides the line segment A'C' in the ratio
t:(1-t), and the point Q' interior-divides the line segment connecting this A'C' interior-dividing point and the point
B' corresponding to B in the ratio s:(1-s). Namely, it is preferable that the source polygon is divided into triangles,
and interior points of the destination polygon are determined in the form of interior division of the vectors
concerning the triangle. When expressed in a vector skew field, it becomes

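$$\overrightarrow{BQ} = (1-s)\{(1-t)\overrightarrow{BA} + t\,\overrightarrow{BC}\},$$

thus, we have

$$\overrightarrow{B'Q'} = (1-s)\{(1-t)\overrightarrow{B'A'} + t\,\overrightarrow{B'C'}\}.$$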



[0126] Of course, similar processing will be performed on a triangle ACD which is an upper half of the source
polygon R1 and a triangle A'C'D' which is that of the destination polygon R2.
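
The interior-point computation of Fig. 20 can be sketched in a few lines. The sketch below is a hedged illustration, not the embodiment's implementation: it solves $\overrightarrow{BQ} = u\,\overrightarrow{BA} + v\,\overrightarrow{BC}$ with $u = (1-s)(1-t)$ and $v = (1-s)t$, then applies the same coefficients to the destination triangle. The function name `map_interior_point` and the tuple-based point representation are assumptions.

```python
def map_interior_point(q, src_tri, dst_tri):
    """Map point q inside triangle ABC to Q' inside triangle A'B'C'.

    src_tri = (A, B, C) and dst_tri = (A', B', C'), each point an (x, y) tuple.
    Returns (Q', s, t), where s and t are the interior-division ratios.
    """
    (ax, ay), (bx, by), (cx, cy) = src_tri
    bax, bay = ax - bx, ay - by          # vector BA
    bcx, bcy = cx - bx, cy - by          # vector BC
    bqx, bqy = q[0] - bx, q[1] - by      # vector BQ

    det = bax * bcy - bay * bcx
    if det == 0:
        raise ValueError("source triangle is degenerate")

    # Solve BQ = u * BA + v * BC by Cramer's rule.
    u = (bqx * bcy - bqy * bcx) / det    # u = (1 - s)(1 - t)
    v = (bax * bqy - bay * bqx) / det    # v = (1 - s) t

    one_minus_s = u + v
    s = 1.0 - one_minus_s
    t = v / one_minus_s if one_minus_s != 0 else 0.0  # t is arbitrary when Q = B

    (dax, day), (dbx, dby), (dcx, dcy) = dst_tri
    qx = dbx + u * (dax - dbx) + v * (dcx - dbx)
    qy = dby + u * (day - dby) + v * (dcy - dby)
    return (qx, qy), s, t


src = ((0.0, 0.0), (10.0, 0.0), (10.0, 10.0))   # A, B, C
dst = ((0.0, 0.0), (20.0, 0.0), (20.0, 10.0))   # A', B', C'
print(map_interior_point((8.0, 2.0), src, dst))
```

Because only u and v are needed to place Q', the ratios s and t are returned purely for comparison with the description; the mapping is the affine map that carries A, B, C to A', B', C'.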

[0127] Fig. 21 shows the above-described processing procedure. Firstly, as shown in Fig. 19, the matching results
on the lattice points taken on the first image 11 are acquired (S10). Then, it is preferable that the pixel-by-pixel matching
according to the base technology is performed, so that a portion corresponding to the lattice points is extracted from
those results. It is to be noted that the matching results on the lattice points may be specified based on other matching
techniques such as the optical flow and block matching, instead of using the base technology.

[0128] Thereafter, a destination polygon is defined on the second image 12 (S12), as shown in the right side of Fig.
19. Since the above procedure completes generation of the corresponding point file, data by which to identify the first
image 11 are incorporated into this corresponding point file and are outputted therefrom (S14). Namely, instead of two
images, only one of the images is associated with the corresponding point data. The first image 11 and the corresponding
point file are stored in an arbitrary recording device or medium, or may be transmitted directly via a network or
broadcast wave.

[0129] Fig. 22 shows a procedure to generate intermediate images by using the corresponding point file. Firstly, only
the first image 11 is read in (S20), and then the corresponding point file is read in (S22). Thereafter, the correspondence
relation between points in the source polygon and those of the destination polygon is computed by the method shown
in Fig. 20 (S24). At this time, the correspondence relation on all pixels within the image can be acquired. As described
in the base technology, the coordinates and colors of points corresponding to each other are interior-divided in the
ratio u:(1-u), so that an intermediate image in a position which interior-divides temporally in the ratio u:(1-u) between
the first image 11 and the second image 12 can be generated (S26). However, different from the base technology, the
colors are not interpolated, and the color of each pixel of the first image 11 is simply used as such without any alteration
thereto. It is to be noted that not only interpolation but also extrapolation may be performed.
[0130] Fig. 23 shows a structure of an image interpolation apparatus 10 which performs the above-described
processing. The apparatus 10 includes: an image input unit 12 which acquires the first image 11 and second image 12
from an external storage, a photographing camera and so forth; a matching processor 14 which performs a matching
computation on these images using the base technology or other techniques; a corresponding point file storage 16 which
stores the corresponding point file F generated by the matching processor 14; an intermediate image generator 18
which generates an intermediate image from the first image 11 and the corresponding point file F; and a display unit
20 which displays the first image 11 and the intermediate image as images close to the original motion picture by
adjusting the timing thereof. It is to be noted that the display unit 20 may display the second image 12 at the end of the
display. Moreover, a communication unit 22 sends out the first image 11 and the corresponding point file F to a
transmission infrastructure such as a network or others according to a request from an external side. Then, the second
image 12 may be sent out too. It is to be noted that mesh data, which indicate the size of the mesh, the positions of the
lattice points and so forth, are inputted to the matching processor 14.

[0131] With the above-described structure, the first image 11 and the second image 12 which were inputted
to the image input unit 12 are sent to the matching processor 14. The matching processor 14 performs a pixel-by-pixel
matching computation between the images. Namely, both the images are referred to at the matching stage.
The matching processor 14 generates the corresponding point file F based on the mesh data, and the thus generated
file F is outputted to the corresponding point file storage 16.

[0132] The corresponding point file F records, for the lattice points taken on the first image 11, the corresponding
positions on the second image 12. For example, when the image size is 100 × 100, a pair of corresponding points is
determined by designating, for each lattice point, a single point among 10000 possible positions.

[0133] The corresponding point file F further stores color difference data for the pair of corresponding points. In a case
where the base technology is utilized, correspondence is likely to be made easier for points whose colors are closer 
to each other, so that it is expected that the color difference for the pair of corresponding points is small. Also, in a 
case where the matching technique other than one according to the base technology is utilized, the color difference 
for the pair of corresponding points will be small in general. Thus, even in the case where, for example, the color of 
each point is expressed by 8 bits, it is possible that the difference data may be expressed by, for example, 3 bits. 
Moreover, in a case where a color difference is large to the extent that the color difference exceeds the thus determined 
bit number, the accuracy of the matching should have been primarily considered instead, so that serious errors would 
not be caused in many cases even if the color is expressed by the maximum value in 3 bits, namely, "111". In a case 
where a sufficiently large range is anticipated or assumed for the color difference too, it may be also employed that 
the color is constrained to the 3 bits as a total by quantizing the range in a relatively coarse manner. 
[0134] It is to be noted that the difference data usually shows a statistical bias with a center at zero. Thus, it is
preferable that the difference data are entropy-coded and thereafter stored in the corresponding point file F.
[0135] Moreover, the matching processor 14 may entropy-code the difference data and thereafter store the
entropy-coded difference data in the corresponding point file F.
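
Why a zero-centered bias favors entropy coding can be illustrated by estimating the Shannon entropy of a set of difference values, which gives a lower bound on the average bits per symbol for an ideal entropy coder. This is a minimal sketch: the sample values are made up for illustration, and the function name `shannon_entropy_bits` is hypothetical.

```python
import math
from collections import Counter


def shannon_entropy_bits(symbols):
    """Average information content, in bits per symbol, of a symbol sequence."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Illustrative (made-up) difference data, strongly concentrated at zero.
diffs = [0, 0, 0, 0, 1, -1, 0, 0]
print(shannon_entropy_bits(diffs))
```

For this sample the estimate is about 1.06 bits per value, well below a fixed 3-bit representation; the actual saving depends on the distribution of the real difference data and on the entropy coder used, neither of which is specified here.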

[0136] The intermediate image generator 18 reads out the corresponding point file F upon request from a user or 
due to other factors, and generates an intermediate image. At this interpolation stage, one of the two images, positional 
information on the pair of the corresponding points and the color difference data are utilized. This intermediate image 
is sent to the display unit 20, where the time adjustment of image output is performed, so that motion pictures or 
morphing images are displayed. As evident from this operation, the intermediate image generator 18 and the display 
unit 20 may be provided in a remote terminal side which is separated from the apparatus 10, and in that case the 
terminal can receive relatively light data comprised of the first image 11 and the corresponding point file F and can 
independently reproduce the motion pictures. 

[0137] Moreover, the intermediate image generator 18 may be such that a point aimed within the first image is moved 
according to the positional information, and a position and pixel value of a point which corresponds to the aimed point 
is determined in the intermediate image by also varying the pixel value of the aimed point based on the difference data. 
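
A minimal sketch of this operation follows. It assumes that the difference data equal the pixel value in the second image minus that in the first image, and that the pixel value varies linearly with the same ratio u used for the position; both are assumptions for illustration, and the function name `intermediate_point` is hypothetical.

```python
def intermediate_point(p1, p2, v1, diff, u):
    """Position and pixel value of the intermediate point at ratio u:(1-u).

    p1, p2: corresponding (x, y) positions in the first and second images.
    v1:     pixel value at p1 in the first image.
    diff:   difference data, assumed here to be (value at p2) - v1.
    u:      0 gives the first image, 1 gives the second; values outside
            [0, 1] correspond to extrapolation.
    """
    x = (1 - u) * p1[0] + u * p2[0]
    y = (1 - u) * p1[1] + u * p2[1]
    value = v1 + u * diff  # assumed linear variation of the pixel value
    return (x, y), value


print(intermediate_point((10, 20), (14, 28), 100, -4, 0.25))
```

Only the first image, the positional information and the difference data are consulted, so the second image is not needed at the interpolation stage.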
[0138] The communication unit 22 is structured and provided accordingly on the assumption that there is provided 
a remote terminal, and the communication unit 22 sends out the first image 11 and the corresponding point file F via a 
network or broadcast wave, so that motion pictures can be displayed at the remote terminal side. Of course, the remote 
terminal may be provided for the purpose of storage instead of display. 

[0139] An actual experiment was carried out according to the processing contents of the present embodiments. For
example, a size of 256 × 256 or a similar size was adopted for the first image and the second image, and
satisfactory morphing or a satisfactory motion picture compression effect was obtained by setting the lattice points at
intervals of 10 to some tens of pixels in the vertical and horizontal directions. The size of the corresponding point file
was some kilobytes to 10 kilobytes or so, and it was confirmed that high image quality and a small data amount are
achieved.

[0140] Moreover, though the present embodiments have been described on the understanding that they are
advantageous for terminals such as portable phones and the like, the present invention can be applied to arbitrary
image-related equipment in general.

[0141] Although the present invention has been described by way of exemplary embodiments, it should be under- 
stood that many changes and substitutions may be made by those skilled in the art without departing from the scope 
of the present invention which is defined by the appended claims. 
[0142] In so far as the embodiments of the invention described above are implemented, at least in part, using
software-controlled data processing apparatus, it will be appreciated that a computer program providing such software
control and a storage medium by which such a computer program is stored are envisaged as aspects of the present
invention.

[0143] Further inventive features of the present embodiments are defined in the following paragraphs: 

1. An image interpolation method, comprising: 

acquiring a matching result computed between a first image and a second image; and 
generating an intermediate image of the first image and second image without referring to the second image 
at this stage, by acting the matching result upon the first image and thus by varying position and value of pixels 
included in the first image. 

2. Computer software having program code for carrying out a method according to paragraph 1. 



Claims 

1. An image interpolation method at an encoding end, comprising: 

acquiring a first image and a second image; and 

computing a matching between the first image and the second image acquired, and detecting points which 
correspond between the images, so as to generate a corresponding point file, 

wherein, in addition to positional information on the corresponding points, difference data on pixel values of 
the corresponding points are stored in the corresponding point file. 

2. An image interpolation method according to Claim 1, wherein a number of bits smaller than that utilized in expressing
a pixel value of each point in the first image and the second image is assigned to the difference data.

3. An image interpolation method at a decoding end, comprising: 

acquiring a corresponding point file which describes a matching result of a first image and a second image; and 
generating an intermediate image of the first image and the second image by performing interpolation thereon 
based on the corresponding point file, 

wherein the corresponding point file includes: 
positional information on points which correspond between the first image and the second image; and difference 
data of pixel values thereof, and 

wherein, in said generating, the intermediate image is generated based on the first image, the positional 
information and the difference data. 

4. An image interpolation apparatus at an encoding end, comprising: 

an image input unit (12) which acquires a first image and a second image; and 

a matching processor (14) which computes a matching between the first image and the second image thus 
acquired and which generates a corresponding point file by detecting points that correspond between the 
images, 

wherein, in addition to positional information on the points that correspond between the images, difference 
data of pixel values thereof are stored in the corresponding point file by said matching processor (14). 

5. An image interpolation apparatus according to Claim 4, wherein said matching processor (14) detects points on
the second image that correspond to lattice points of a mesh provided on the first image, and based on a thus
detected result a destination polygon corresponding to the second image is defined on a source polygon that
constitutes the mesh on the first image.



6. An image interpolation apparatus at a decoding end, comprising: 
a communication unit (22) which acquires a corresponding point file which describes a matching result of a
first image and a second image; and

an intermediate image generator (18) which generates an intermediate image of the first image and the second 
image by performing interpolation thereon based on the corresponding point file, 

wherein the corresponding point file includes: 
positional information on points which correspond between the first image and the second image; and difference 
data of pixel values thereof, and 

wherein said intermediate image generator (18) generates the intermediate image based on the first image,
the positional information and the difference data.

7. An image interpolation apparatus according to Claim 6, further comprising a display unit (20) which displays at 
least the intermediate image. 

8. An image interpolation apparatus according to one of Claims 6 and 7, further comprising a corresponding point 
file storage (16) which records in a manner such that the corresponding point file is associated to the first image. 

9. An image interpolation apparatus according to one of Claims 6-8, wherein said intermediate image generator (18) 
is such that a point aimed within the first image is moved according to the positional information, and a position 
and pixel value of a point which corresponds to the aimed point is determined in the intermediate image by varying 
the pixel value of the aimed point based on the difference data. 

10. Computer software having program code for carrying out a method according to Claim 1.

11. Computer software having program code for carrying out a method according to Claim 3.

12. An image interpolation method according to Claim 1, wherein the difference data are entropy-coded and, thereafter,
the entropy-coded difference data are stored in the corresponding point file.

13. An image interpolation apparatus according to Claim 4, wherein said matching processor (14) entropy-codes the
difference data and thereafter stores the entropy-coded difference data in the corresponding point file.